
John Benjamins Publishing Company

This is a contribution from Prosody and Iconicity.


Edited by Sylvie Hancil and Daniel Hirst.
© 2013. John Benjamins Publishing Company
Iconic interpretation of rhythm in speech

Tea Pršir & Anne Catherine Simon

Université de Genève / Université catholique de Louvain

Approaches to iconicity are most often related to fundamental frequency
(Ohala 1984). This article examines to what extent rhythm in speech is perceived
and interpreted iconically. Iconic rhythmic patterns that imitate part of the
lexical or syntactic content have no codified meaning in the system of a language.
On the other hand, scansion or variation in speech rate and tempo may act as
contextualization cues (Gumperz 1992). We propose to distinguish between three
types of rhythmic iconicity: iconicity on a local (word or phrase) and on a global
(utterance or sequence of utterances) level, and contextualization provoked by
contrast between sequences.

1. Introduction

Rhythm, in our view, is not thought of as a property of speech itself, but as a
construction in the listener's mind derived from some kind of repetition of events
in time. This contribution tackles the issue of how the perception of a "rhythmic
pattern" (Auer et al. 1999:23) contributes iconically to the meaning of speech. The
two main rhythmic phenomena under investigation are the following: isochrony
creating rhythmic scansions and variation in speech rate and tempo.
Perception of rhythmic isochrony rests on prominent syllables that are regularly
spaced over time and perceived as rhythmically patterned. Prominence is "the
property by which linguistic units are perceived as standing out from their
environment" (Terken 1991:1768). Pitch movement, syllable lengthening or increase in
loudness may contribute to syllabic prominence. Intonation, duration and accentuation
are therefore considered part of rhythm, since they contribute to rhythmic
patterns. Following Auer et al., we use "scansion"
to refer to particularly marked rhythmical sequences in oral conversation
discourse. These are produced in a highly salient, rhythmically regular pattern,
for instance by lining up a number of phonetically strongly marked primary
lexical stresses in a continued series of isochronously recurring beats. They
appear as ordered and marked stretches of speech. (Auer et al. 1999:153)


Rhythmic scansions have a specific tempo, that is, they develop with a given number
of stressed (prominent) syllables per unit of time. The shorter the time interval
between two stressed syllables, the faster the tempo. The ratio of stressed to
unstressed syllables, that is, the density, can create a more or less emphasised speech
style. Finally, we pay attention to the variations in speech rate, i.e. the number of
syllables per unit of time (most often, per second).
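
The three measures defined above can be made concrete with a minimal sketch. The class name `Stretch`, its fields and the sample values are illustrative assumptions, not taken from the corpus. Density is computed here as stressed syllables over all syllables, which is how the figures 1/1 and 2/5 reported later for Example (1) appear to read; a strict stressed-to-unstressed ratio would need a different denominator.

```python
from dataclasses import dataclass


@dataclass
class Stretch:
    """A stretch of speech described by simple counts (hypothetical structure)."""
    syllables: int     # total number of syllables
    stressed: int      # number of stressed (prominent) syllables
    duration_s: float  # duration in seconds


def speech_rate(s: Stretch) -> float:
    """Speech rate: syllables per second."""
    return s.syllables / s.duration_s


def tempo(s: Stretch) -> float:
    """Tempo: stressed syllables per second."""
    return s.stressed / s.duration_s


def density(s: Stretch) -> float:
    """Accentual density, assumed here to be stressed / total syllables."""
    return s.stressed / s.syllables


# Illustrative values only: 5 syllables, 2 of them stressed, spoken in 1 second.
example = Stretch(syllables=5, stressed=2, duration_s=1.0)
print(speech_rate(example), tempo(example), density(example))
```

Keeping the three measures as separate functions mirrors the distinction drawn in the text: tempo and density depend on which syllables count as prominent, whereas speech rate does not.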
It is commonly noted that rhythmic scansions and speech rate do not have a
fixed signalling value as grammar does, although they have at their base an
iconic value (Auer et al. 1999:153). Our main research questions are the following:

- How does the rhythmic organisation in speech reflect time or movements
  from everyday life, in an iconic relationship?
- How does grammar¹ coexist with iconicity in language use, and how do they
  interact and complement each other in creating meaning?

We first discuss some established approaches to phonetic and prosodic iconicity
(Section 2). Those approaches have in common the priority given to fundamental
frequency (F0) or articulatory phonetics in the study of iconicity. In Section 3, we
give an overview of the contribution of iconicity to the construction of meaning.
This leads us to propose a model for the iconic interpretation of rhythm. The
remainder of the contribution analyses and discusses some examples with different
kinds and degrees of iconicity related to rhythm. All the examples come from
a corpus of French radio press reviews (Pršir to appear), of which we analysed the
most salient rhythmic modifications. By definition, a radio press review is an
assemblage of reported speech and personal commentaries by the journalist; therefore
prosody and rhythm are extensively used to contextualise other voices in discourse
and to express the speaker's stances. The speech of the radio journalist we
study can be characterised as very expressive; for that reason, it shows a great deal
of prosodic variation.

2. Phonetic iconicity, a brief state of the art

Research on prosodic iconicity has mostly focused on intonational phenomena. From an
ethological perspective, Ohala (1984) compared the F0 of voice in different linguistic
communities. He showed how the F0 variation could be interpreted iconically

1. The most conventional part of language.


and not phonologically. For example, "[...] high F0 signifies (broadly) smallness,
nonthreatening attitude, desirous of the goodwill of the receiver, etc. and low F0
conveys largeness, threat, self-confidence and self-sufficiency" (Ohala 1984:14).
Gussenhoven (2002) systematised Ohala's findings on the frequency code and
extended them to the effort and the production codes. The effort code is responsible
for the effects of hyper- and hypoarticulation, as well as of the F0 amplitude
movements. The production code is related to the respiration phases.
One generally distinguishes between four types of iconicity,² according to the
degree of relatedness between the linguistic form and the form of the designated
referent. According to Hinton, Nichols & Ohala (1994), iconicity, or sound symbolism,
can be imitative, corporeal, synesthetic or conventional. Fischer (1999)
adopts a classification very close³ to the former, since iconicity is defined as
auditory, articulatory or associative. Fónagy (1999) does not propose any classification,
but considers that iconicity is ruled by three principles: reproduction of symptoms,
correlation between speech organs and bodily gestures and isomorphism of
expression and contents. We briefly define and compare those types.
Auditory, or imitative, iconicity is known under the name of onomatopoeia:
natural sounds or noises are imitated by speech sounds (animal noise like "miaow"
(lexical) for the cat; or nature noise like "shshsh" (non-lexical) for the wind or the
wave). "Auditory" refers to the fact that the speaker tries to reproduce a sound
heard previously.
Articulatory, or corporeal, iconicity is inherent to the production of certain
sounds, i.e. to the spatial position of speech organs. For example, the vowel /i/
is characterised by a high tongue position, which we may interpret as smallness
(of the opening between tongue and palate) or as nearness (of tongue and palate)
(Fischer 1999:126). According to Hinton et al. (1994), corporeal iconicity is mostly
related to emotional aspects of language. Studies of the expression of emotions
(Fónagy 1983) showed that F0 and rhythm variations differ according to whether
emotions are active or passive. Active emotions involve excitation, such as happiness
or anger: the F0 range is higher and the speech rate increases. Conversely,
passive emotions involving depression, such as sadness or indifference, lower the
F0 and slow down the speech rate.

2. Some authors employ the term "sound symbolism" (see for example Hinton et al. 1994).
3. Fischer (1999:131–133) uses the term "(phonological) iconicity" instead of "(sound)
symbolism". He wants to avoid any confusion with Peirce's distinction between symbol (arbitrary
and conventional) and icon (motivated, like: image, diagram or metaphor).


Table 1 recapitulates the continuum between direct associations (as in imitative
and corporeal iconicity) and indirect associations.

Table 1. Iconicity classification

                                        Direct association <-------------------> Indirect association
Hinton et al. (1994)   Sound symbolism  Imitative    Corporeal      Synesthetic     Conventional
Fischer (1999)         Iconicity        Auditory     Articulatory   Associative I   Associative II

Fónagy (1983, 1999)    Principles of iconicity:
                       1. Reproduction of symptoms
                       2. Correlation between speech organs and bodily gestures
                       3. Quantitative isomorphism of expression and content

The continuum expressed in Table 1 indicates that the associations between signifier
and signified can be more or less direct (towards iconic) or indirect (towards
conventional). Articulatory and auditory iconicity are called
"primaries" because they perform a more or less direct form-meaning relationship.
Associative iconicity, on the other hand, is called "secondary", for it is more abstract,
less immediate, and context dependent. Typical examples of associative iconicity
are phonesthemes⁴ such as the sound cluster -ash associated with violence and/or
speed (clash, rush, splash in English).
Fónagy (1999) puts forward three principles governing voice quality of sounds
and expressive oral gesturing. The first principle is about the reproduction of some
physical symptoms such as pharyngeal contraction related to nausea, which may
express attitudes such as dislike, contempt or hatred. According to the second
principle, the movement of the speech organ (lips or tongue) can signal the state
of a whole body and thus becomes a symbolic (allusive) gesture: for instance, the
anticipation of a kiss in a tender lip rounding [...] (1999:8). A third principle
governing vocal gesturing is quantitative isomorphism of expression and content:
different degrees of intensity, height and duration reflect different degrees of
excitement or semantic intensity (Fónagy 1999:9).
As mentioned in the introduction, we restrict ourselves to the analysis of
iconicity in rhythm. The third principle mentioned by Fónagy, that is, quantitative
isomorphism between expression and content, is applicable to the study

4. "[These] submorphemic meaning-carrying entities are sometimes called phonesthemes,
or phonetic intensives (Bolinger 1965). While phonesthemes are often conventional, some have
universal properties" (Hinton et al. 1994:5).


of rhythm, since we hypothesise that strong modification in rhythm, as well as
rhythmic scansions, can quite directly relate to the contents expressed by a speaker.

3. Iconicity, rhythm and the construction of meaning in speech

Crystal defines iconicity by saying that individual sounds are thought to reflect
properties of the world, and thus have meaning (Crystal 1987:174). As soon as
we consider the iconic value of rhythm (and not of individual vowels or consonants),
this definition has to be adapted. Furthermore, we consider how iconicity
functions in interactive speech as opposed to its meaning in the language system
(see for example the phonesthemes illustrated in Section 2).
In line with interactional research on interpretation, we consider that individuals
engaged in a verbal encounter do not just rely on literal or denotational
meaning to interpret what they hear (Gumperz 2001:126). More often than not,
they build inferences on their expectations about what is to follow, on culturally
specific background knowledge, and on particular cues like intonation and
rhythmic patterns.
We think that rhythm in speech contributes to the construction of meaning
in interaction and we try to explain how by reviewing three proposals,
namely the dual encoding model (Fónagy 1993, 1999), the experiential approach
to discourse interpretation (Auchlin 2008) and the contextualisation process
(Gumperz 1992, 2001).
Fónagy assumes a twofold encoding of any utterance, first grammatical and
then iconic. Any utterance generated by the Grammar goes through the so-called
Modulateur (modulator or distorter), which grafts secondary messages on the
utterance (Figure 1). Iconicity belongs to that second coding.

[Figure 1: diagram whose labels read Emetteur, Grammaire, Message primaire, Modulateur, Message complete, Récepteur]

Figure 1. Model of double coding by Fónagy (1983:229)

The Distorter (or Modulator) ensures the completeness of a message's meaning,
since it enables speaker and listener to access the emotive and social meaning
of speech. Fónagy claims that iconicity is a property of language that explains the


personal style of individual speaking. According to Fónagy's definition of iconicity,
"motivated" is equal to "iconic": "All signs are conventional by definition, and
may be more or less motivated (iconic)" (1999:10).
Auchlin's observations are in keeping with Fónagy's, but he does not draw a
clear-cut separation between a first and a second coding for each utterance.

Les prosodies rendent accessibles, sensibles, de nombreuses informations
relatives au parleur, à son état motivationnel, vis-à-vis du fait de parler (à X) ou du
contenu à évoquer (devant X) [...]. En cela, les prosodies invitent à re-considérer
le sens linguistique à l'aune de son expérience temporelle, dans la durée de son
élaboration: les efforts voco-prosodiques semblent essentiellement consacrés
à faire émerger une expérience de sens partagé. (Auchlin 2008:5)⁵

In other words, vocal and prosodic variations (of intonation, register or rhythm)
have a significant influence on how one will experience a speech event. Prosody
is interpreted according to its temporal synchronisation (or desynchronisation)
with other communicative aspects (such as turn-taking, utterance segmentation,
emphasis, etc.). Prosody is iconic in that it evokes the degree of effort the
speaker puts into his speech production and the way he converges (or not) with his
co-interactant (see Auer et al. 1990; Auchlin & Simon 2004).
Although Gumperz's notion of contextualisation has more to do with indexicality
than with iconicity, its contribution to the understanding of iconic meaning
is essential.

I use the term "contextualization" to refer to speakers' and listeners' use of
verbal and nonverbal signs to relate what is said at any one time and in any one
place to knowledge acquired through past experience, in order to retrieve the
presuppositions they must rely on to maintain conversational involvement and
assess what is intended. (Gumperz 1992:230)

According to Gumperz, interpretation relies on inference in a situated context.
Contextualisation cues are features of the surface of the message, the function
of which is to foreground a given sequence (a syllable, a word, an utterance, a turn
at talk) by creating contrasts.

5. "Prosodies make available and perceptible numerous pieces of information related to the
speaker, to his motivational state concerning the fact that he speaks (to X), or the subject
he evokes (in front of X) [...]. By this, prosodies invite us to reconsider the linguistic meaning,
taking into account the temporal experience, i.e. the time needed for the meaning elaboration.
The vocal-prosodic efforts seem to be essentially dedicated to cause the emergence of shared
meaning experience." (our translation)


Intonation, by its very nature nonreferential, gradient and evocative, is seen as
a prime contextualization-cue in this approach. Yet intonation in the restrictive
sense of "pitch configuration" rarely functions alone to cue an interpretive
frame. The same frame may be cued by timing and volume as well.
(Couper-Kuhlen 2003:16)

From what precedes, we retain the following key ideas that connect rhythm to
iconic meaning:

1. Iconic meaning is strongly related to imitation (or mirroring) while contextualisation
helps draw inferences by creating contrasts. In the remainder of
this contribution, rhythm is interpreted as iconic if it is possible to establish
a resemblance between the rhythmic form and a referent; it is interpreted as
a contextualisation cue if the rhythm seems to activate an inference without
having a clearly identified iconic meaning.
2. Variations in speech rate or tempo (acceleration or deceleration) can express
temporal notions, like speed (e.g. agitation) as opposed to slowness (e.g. calm).
3. The occurrence of a rhythmic scansion also has a meaning potential as pointed
out by Auer et al. (1999) within a contextualisation approach:
Perseveration, when the beat goes on and on, may hint at a more general and
more elementary meaning potential which can be associated naturally and
iconically with the isochronous recurrence of beats in marked instances; it may
be a prosodic metaphor of the unfailing reliability and predictability of next
events in time. For this reason it is a particularly appropriate contextualization
cue for thetic acts (i.e. affirming, insisting on a point and ensuring the credibility
and reliability of statements). (Auer et al. 1999:158)

4. The span of rhythmic phenomena may vary from a very local focus (on a
single word) to a global one (when foregrounding an utterance or a sequence).

In the next sections, we discuss the three following types of rhythmic iconicity:
iconicity created by global (Section 4) or local (Section 5) rhythmic phenomena,
and rhythm interpreted as a contextualisation cue, with an iconic meaning
potential (Section 6).

4. Iconic rhythm at a global level

Rhythm phenomena such as variation of speech rate or scansion, sometimes combined
with declination of F0, need to be observed on a global span of speech, as
they spread over a sequence of utterances. The sequence perceived as iconic is
often salient and emerges from the surrounding cotext.


Example (1) is rhythmically characterised by a scansion creating an effect of
staccato (literally, with syllables detached from each other), accompanied by an
effect of reduction that can be defined as a diminuendo movement in musical
language. The perception of the rhythmic scansion comes from both a high
accentual density⁶ and perceived isochrony. Each line of transcription corresponds
to a prosodic group; isochrony is represented by the spacing of the slash
bars around each group.⁷

(1) électeurs (voters)

    / il arrive /             it happens
    / que les électeurs /     that the voters
    / préfèrent /             prefer
    / ceux /                  those
    / qui les laissent /      who let them
    / dormir /                sleep

The measurement of temporal intervals between stressed syllables shows a relative
isochrony:⁸ the perception of the scansion derives from a significant lengthening
of the final syllable in the first 4 units, as well as from long pauses between
them. Prosodic groups are marked by a bridge accent (arc accentuel, Fónagy
1980), i.e. a prosodic pattern in which the first and the last syllables of a segment
are accented. In this case, there is a combination of an accent on the initial
syllable of each group (il arrive, que les électeurs) and a marked final syllable
lengthening (il arrive, que les électeurs). The purpose of this accentual schema
is to focus on the semantic contents and coherence of an utterance. In this way,
bridge accents reinforce the segmentation of speech into short clear-cut units,⁹
creating emphasis.
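
The tolerance reported for perceived isochrony (a difference of 20% between two temporal intervals, according to the experiments cited in the notes) can be sketched as a simple check. The function name, the choice to compare each interval with the one preceding it, and the sample interval values are assumptions made for illustration; they are not measurements from Example (1).

```python
def is_perceptually_isochronous(intervals_ms, tolerance=0.20):
    """Check whether successive inter-stress intervals stay within a tolerance.

    Assumption: each interval is compared with the one before it, and the
    relative difference is taken with respect to that preceding interval.
    """
    for previous, current in zip(intervals_ms, intervals_ms[1:]):
        if abs(current - previous) / previous > tolerance:
            return False
    return True


# Hypothetical interval durations in milliseconds (not corpus data).
regular = [500, 550, 600, 560]
irregular = [500, 650, 480]

print(is_perceptually_isochronous(regular))
print(is_perceptually_isochronous(irregular))
```

Other reference points (the mean interval, or the longer of the two intervals) would be equally defensible; the sketch only shows that "relative isochrony" can be stated as a threshold on interval differences.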

6. That is, a high ratio of stressed to unstressed syllables, stressed syllables being indicated by
a primary accent mark within the transcription. In this example, density ranges from 1/1 (ceux)
to 2/5 (que les électeurs), with a mean ratio of 1/2.5.
7. Non-proportional font is used for analyzing rhythmically regular passages. We
adopt the convention from Auer et al. (1999:ix–xi), as it has been adapted for French by Simon
and Grobet (2005:4). Temporal intervals are delimited by a slash (/). The closer they are, the
shorter the time interval from one stressed syllable to the other.
8. Experiments reported by Couper-Kuhlen (1993:24) and Auer et al. (1999:51–54) suggest
that a difference in duration of 20% between two temporal intervals can be tolerated without
disturbing the perception of isochrony.
9. Resulting in a perception of 6 well-defined groups: (il arrive) (que les électeurs) (préfèrent)
(ceux) (qui les laissent) (dormir).


The reduction effect (diminuendo) results from a gradual diminution of the
duration of the final vowels; the pauses also become shorter as the utterance develops
(pause duration is indicated after each unit in the Prosogram, see Figure 2). Finally,
the falling contours on each final accented syllable create a global falling movement
(with F0 being a bit lower at each occurrence, as in a declination line, see Figure 2).
Rhythm and intonation combine in reflecting the image of somebody
who is falling asleep (as if one's moments of wakefulness were getting shorter).
Nevertheless, on reaching the last item dormir (to sleep), the journalist puts
a strong initial accent and unstresses the last syllable, creating a strong contrast
with the former rhythmic organisation. Contrary to what one might expect, the
stressed vowel [ɔ] is pronounced with a hoarse voice¹⁰ and evokes irritation,
which is unfavourable to falling asleep.
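
The shortening of the pauses can be summarised with a least-squares slope over the five pause durations displayed in Figure 2 (521, 524, 369, 310 and 160 ms). The helper name `trend_slope` is hypothetical, and fitting a straight line is only one possible way of quantifying the diminuendo; it is not presented here as the authors' procedure.

```python
def trend_slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


# Pause durations in ms after each of the first five prosodic groups (Figure 2).
pauses_ms = [521, 524, 369, 310, 160]

slope = trend_slope(pauses_ms)
# A negative slope means that pauses tend to get shorter from one group to the next.
print(f"{slope:.1f} ms per prosodic group")
```

The slope is clearly negative even though the second pause is marginally longer than the first, which fits the description of a gradual rather than strictly monotonic reduction.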

[Figure 2: Prosogram v2.4e (TVB, 578.38 s to 584.78 s). Text tier with pause durations:
il arrive (521 ms) que les électeurs (524 ms) préfèrent (369 ms) ceux (310 ms)
qui les laissent (160 ms) dormir mais]

Figure 2. Prosogram¹¹ of Example (1). Vertical arrows indicate the perceptual centres of
the syllable (at the onset of the vowel), that is, the moment where the beat is perceived. The
reduced space between arrows indicates that the tempo goes faster.

10. Prosogram does not provide information about voice quality variation because the vocal
fold vibration is not regular enough to produce F0.
11. Prosogram (Mertens 2004) is a stylised transcription of prosody based on the processing
of time, intensity and F0. Prosogram displays the F0 contour calculated by the Praat software
(Boersma & Weenink 2009) as discrete sequences of stylised pitch (thick lines). The stylisation
is an estimation of the pitch contour perceived by human listeners, based on perception studies.
The evolution of speech over time is represented at the top of the Prosogram: from one vertical
mark to another, there is a span of 100 milliseconds (ms). For instance, in Prosogram 1, the first
pause duration is 521 ms and the duration of the preceding syllable [ʁiv] is approximately 400 ms.
The Prosogram shows that the syllable [ʁiv] has a falling pitch contour.


What is iconic in this passage? On the one hand, one can notice the rhythmic
and intonational imitation of falling asleep, with a marked intonation of finality
(falling movement). On the other hand, one can simultaneously perceive the voice
quality signalling the speaker's irritation about the fact that voters are actually put to
sleep. Therefore, we simultaneously have the iconic mimicry of a situation (voters
falling asleep) and information about the attitude of the speaker (who is critical of
this state of affairs).
Example (2) illustrates iconicity conveyed by means of variation in speech rate.
The speech rate of three consecutive discourse segments displaying great variation
and contrast was measured (number of syllables per second) and compared.
(2) déraper¹² (to lose control)
    a. porter le fer dans la plaie et investiguer à tout crin [4.9 syl/sec]
       turning the knife in the wound and to investigate unduly
    b. quitte à déraper quelques fois [6.8 syl/sec]
       even if he lost control from time to time
    c. est devenu oui l'imperator le maître absolu du journal [4 syl/sec]
       he became yes the emperor the supreme master of the newspaper

In this example, the journalist is commenting on the attitude of the editor-in-chief
of the French newspaper Le Monde, who is quitting his position. Segment (b) is
characterised by an increase of roughly 40% in speech rate when compared to segment (a),
from 4.9 to 6.8 syl/sec. It also stands out by a lower intensity (65 dB)
with respect to the average intensity (72 dB) of the two surrounding segments. We
attribute an iconic value to this accelerated speech rate: the verb déraper implies
a loss of control and is semantically related to speed: if someone loses control
(dérape) it is often because of a high speed.
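
The size of the acceleration can be checked from the speech rates given in Example (2). The function name is an assumption; the figures are those reported for segments (a), (b) and (c).

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after` (0.25 means +25%)."""
    return (after - before) / before


rate_a, rate_b, rate_c = 4.9, 6.8, 4.0  # syllables per second, from Example (2)

acceleration = relative_change(rate_a, rate_b)  # segment (a) -> segment (b)
deceleration = relative_change(rate_b, rate_c)  # segment (b) -> segment (c)

print(f"(a) to (b): {acceleration:+.0%}")
print(f"(b) to (c): {deceleration:+.0%}")
```

The first value comes out just under 40%, in line with the increase stated for segment (b); the second shows that the return to a slower rate in segment (c) is of a comparable size.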
From the syntactic viewpoint, the fast segment is an interpolated clause. In
our corpus, interpolated clauses are usually realised with a lower register and a
faster speech rate,¹³ their purpose being to provide the quotation source. This is not
the case here: the interpolated clause is actually part of the quotation and functions
to characterise the quoted person as someone rapid (and thereby likely to make mistakes).

12. The French verb déraper is most often used in a moving machine context, for unexpected
(out of control, side) motion.
13. See Gachet and Avanzi (2008) for a review of the prosodic features of interpolated clauses
in French.


The third example of iconic rhythm at a global level involves rhythmic slowing
down. The speaker mimics the discourse of a man in his seventies, namely
the French President Jacques Chirac.¹⁴

(3) septuagénaire (septuagenarian)
    a. il avait choisi de faire lire un message par le
       héros du jour mais sans doute pas à cette vitesse
       Jacques Chirac y faisait l'éloge de la vitalité
       de Nicolas Sarkozy [6 syl/sec]
       he chose to let the hero of the day read the
       message but not at this speed Jacques Chirac was
       celebrating the vitality of Nicolas Sarkozy
    b. ce mot admirable sous la plume d'un septuagénaire
       résumait bien l'amvi l'ambivalence de ses
       sentiments du grand art [3.9 syl/sec]
       these words of admiration written by a
       septuagenarian summed up well the amvi ambivalence
       of his feelings of high art

The fast speech rate (6 syl/sec) in segment (a) is followed by a significant slowing down
(3.9 syl/sec) in segment (b). The contrast between the two segments is responsible
for perceiving the second one as very slow.
A closer look at vowel duration and quality explains their contribution to the
global perception of an alteration in rhythm. Apart from the salient variation of
the speech rate, the semantic and phonetic contents of 4 words, vitesse (speed),
vitalité (vitality), admirable (admirable) and art (art), lead to the interpretation
of the first and second segments respectively as vivacious/dynamic/rapid
versus passive/old/slow. The first two items contain a front close vowel [i] that is
associated (Ohala 1984; Fónagy 1983) with smallness, shrillness and rapid movements.
The first [i] of vitalité is 6 ms long and the [i] of vitesse is 8 ms, which is much
shorter than the duration of the [a] vowel in art or admirable. The second [a] of
admirable lasts for 20 ms and the one of art takes up 30 ms. The openness of the
vowel [a] is associated with largeness, gentleness, and with slow and heavy movements.
The voice quality of the vowels is modified as well: the first [i] of vitalité
is voiceless but nevertheless invested with a large amount of articulatory energy
while the [a] of art is pronounced with a creaky voice that signals a decrease in
breath and energy.

14. The corpus contains data from 2004: at that time, Jacques Chirac was the French President
and Nicolas Sarkozy his Minister of Domestic Affairs.


Together, these elements (speech rate, vowel duration and quality) iconically
represent the contrast between two opposite persons: Nicolas Sarkozy as someone
full of energy and Jacques Chirac as someone who runs at half speed, who is
exhausted.
This illustrates how rhythmic phenomena (acceleration or deceleration of
speech rate or reduction resulting from joint temporal and intonational phenomena)
can receive an iconic interpretation when the temporal movement mimics a part
of the lexical content of a word or an utterance.

5. Iconic rhythm at a local level

The two following examples illustrate rhythmic iconicity located on a single word
or a phrase whose sound feature imitates an object from everyday life. We will see
that rhythm can be obtained in a different way, not only by speech rate as in the
previous examples, but also by repetition of the same melodic pattern.
The syntagm énorme vague (huge wave) in Example (4) is what Fónagy
(1999:14) calls self-representation of linguistic units: it is a description and
mimesis of its own content at the same time. The rhythmic pronunciation of
the items énorme vague (huge wave) and gonfle (growing bigger) iconically
represents a threat.
(4) énorme vague (huge wave)
    que l'énorme vague qui gonfle en Asie
    that the huge wave that is growing in Asia

[Figure 3: Prosogram v2.4e (PONT, 453.32 s to 456.02 s). Text tier: que l'énorme vague qui gonfle en Asie]

Figure 3. Prosogram of Example (4). Grey lines highlight the iconic melody (shaped like a
wave) on the phrases énorme vague and gonfle en Asie

The perception of a regular rhythm is the result of the repetition of a melodic
contour shaped by ample and dynamic tones (see Figure 3) and of a constant
number of syllables within each interval (5 syllables). The first
pitch contour rises and falls just like a wave spreading over the phrase énorme
vague; the peak is aligned to the vowel [ɔ] of énorme, after which the contour
falls. The second contour on qui gonfle en Asie has a peak on the nasal vowel


[ɔ̃] (24 ms); this movement symbolises the rounding inherent to the meaning
of the word gonfle (blow up). Like the vowel [a] in the example septuagénaire,
[ɔ] and [ɔ̃] are associated with largeness; in this example they symbolise growing
and expansion.
Example (5) assaut is to be compared with Example (3) septuagénaire since
both have in common the effect of contrast. Contrast in Example (5) is obtained by
a strong accentuation and a variation of F0 register. The journalist recalls two episodes
in the relationship between Jean-Marie Colombani and Edwy Plenel: their
meeting and alliance followed by their divorce.

(5) assaut (assault)
    a. le démocrate chrétien Jean-Marie Colombani
       et le trotskiste culturel Edwy Plenel
       the Christian Democrat Jean-Marie Colombani
       and the Cultural Trotskyist Edwy Plenel
    b. ont pris Le-Monde d'AS-SAUT
       ils étaient AL-LIÉS
       scaled Le Monde they formed an alliance
    c. / s'ils divorcent aujourd'hui / 223 ms (12%)
       / c'est parce que les ventes du journal baissent / 232 ms (4%)
       / et que baissent aussi les recettes publicitaires / 285 ms (22%)
       if they divorce today it is because the sales
       of the newspaper come down as well as the
       advertising receipts

This example is rhythmically sequenced in three parts (a–c). Segments (a)
and (b) are characterised by an extended F0 register (64–220 Hz) and by high
pitched onsets on the four initial accents (in bold letters in the transcription).
The initial accents on the first names Jean and Edwy are followed by a steep F0
declination that extends over the complete names Jean-Marie Colombani and
Edwy Plenel (see rectangles in Figure 4). Then, initial accents on assaut and
alliés are completed with final accents, creating two parallel bridge accents (arcs
accentuels) on both words (see semicircles in Figure 4). The phonetic structure
of assaut, two lengthened vowels [a] (120 ms) and [o] (241 ms) separated by a
voiceless fricative [s], allows a break in the middle of the word. We can say that
the pronunciation of assaut is iconic in that it enacts a sudden jump. There is a
phonetic parallelism with alliés (vowel-consonant-vowel), as well as a prosodic
parallelism (both are pronounced with a similar melodic pattern). The last bit
of the sequence, (c), contains only final accents that establish isochrony by

2013. John Benjamins Publishing Company


All rights reserved
Tea Prir & Anne Catherine Simon

regular time intervals15 with a narrowed F0 register (66175Hz). The contrast


between the first two segments and (c) is reinforced by the iconic representa-
tions of their central items assaut and baissent (to lower).
On the one hand, the initial accents symbolise the strength and the energy in
an alliance; on the other hand, the final accents symbolise ending of the alliance
and separation of Jean-Marie Colombani and Edwy Plenel.
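
The regularity of the intervals in segment (c) can be checked against the durations given in the transcription of Example (5). The sketch below is a minimal illustration, not the procedure used for the analysis: it assumes that the percentage printed next to each duration expresses the change relative to the preceding interval. This reading agrees with the values reported for the second and third intervals (4% and 22%) but is not stated explicitly in the text, and the function name `relative_changes` is hypothetical.

```python
def relative_changes(intervals_ms):
    """Return the relative change (in %) of each interval with respect to
    the interval that precedes it. The first interval has no predecessor
    in the list, so the result has one element fewer than the input."""
    changes = []
    for previous, current in zip(intervals_ms, intervals_ms[1:]):
        # Assumption: the change is measured against the preceding interval.
        changes.append(100.0 * abs(current - previous) / previous)
    return changes


# Interval durations of segment (c) in Example (5), in milliseconds.
segment_c = [223, 232, 285]

for change in relative_changes(segment_c):
    print(f"{change:.1f}%")
```

Under this assumption the two computed changes are close to the percentages given for the second and third intervals. The 12% attached to the first interval would refer to an interval preceding segment (c), whose duration is not given in the transcription.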

[Prosogram v2.4e output; recording TVB, 76.22–94.22 s]

Figure 4. Prosogram of Example (5). Letters (a, b, c) mark the beginnings of the three
segments. Rectangles indicate the F0 declination, semicircles indicate the bridge accent
and vertical arrows indicate final accents.

15. See the duration between successive beats, as indicated in the transcription (in ms).


Disparate prosodic cues (duration, accentuation, F0 variation) combine to produce
the reiteration of a contour (bridge accent and F0 declination), dynamic tones
and final syllable extension; what they have in common is repetition, which
organises the whole sequence. In the next section, we show how prosody can
combine with lexical repetition to create iconicity.

6. Rhythm as a contextualisation cue

Contextualisation cues do not have a clearly identified symbolism, but they lead
the listener to draw inferences about what is said. These cues relate
simultaneously to the signalling of reported speech and to the signalling of the
speaker's attitude.
In Example (6) "soixante-deux pour cent" ('sixty-two percent'), the variation of
speech rate and accentuation is the cue for understanding the relationship between
the radio journalist and the written text that he is quoting. "Sixty-two percent" is
the result of an opinion poll in which French citizens were asked whether they
would approve the European Constitution. The percentage is repeated three times,
each occurrence having its own prosodic and rhythmic colour.

(6) "soixante-deux pour cent" ('sixty-two percent')

    a.  / soixante-deux pour cent /
        / soixante-deux pour cent /
        / des socialistes /
        / sympathisants / [4.2 syll/sec]
        'sixty two percent sixty two percent of socialists
        sympathisers'
    b.  les sympathisants socialistes pas des militants
        qui vont voter le premier décembre [6.2 syll/sec]
        'the sympathisers socialists not the militants that
        will vote December the first'
    c.  soixante-deux pour cent sont favorables à la
        constitution européenne [5.5 syll/sec]
        'sixty two percent are favourable to the European
        Constitution'

Segment (a) has a speech rate of 4.2 syll/sec; the rhythm is regular, with a
progressive decrease in interval duration. The regularity of the rhythm is also
ensured by the repetition of "soixante-deux pour cent", with a short but unusual
extension of the schwa vowel [ə] in "soixante" [swasɑ̃tə] and by the insistence
accent on "deux". In segment (b), speech rate accelerates by 47% [6.2 syll/sec]
and immediately decelerates by 10% [5.5 syll/sec] on the last segment.
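
The changes in speech rate can be recomputed from the rates themselves. The following sketch is an illustration under assumptions: it expresses each change relative to the rate of the preceding segment, and the helper name `rate_change_percent` is hypothetical. Because it starts from the rounded rates given in the text, its results may differ by a point or so from the percentages reported above, which were presumably derived from unrounded measurements.

```python
def rate_change_percent(previous_rate, current_rate):
    """Change in speech rate relative to the previous segment, in percent.
    Positive values mean acceleration, negative values deceleration."""
    return 100.0 * (current_rate - previous_rate) / previous_rate


# Speech rates of Example (6), in syllables per second.
rates = {"a": 4.2, "b": 6.2, "c": 5.5}

a_to_b = rate_change_percent(rates["a"], rates["b"])
b_to_c = rate_change_percent(rates["b"], rates["c"])
print(f"(a) -> (b): {a_to_b:+.1f}%")
print(f"(b) -> (c): {b_to_c:+.1f}%")
```

With the rounded rates, the acceleration from (a) to (b) comes out between 47% and 48%, and the deceleration from (b) to (c) between 11% and 12%.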


[Prosogram v2.4e output; recording TVB, 596.93–608.93 s]

Figure 5. Prosogram of Example (6). Vertical arrows indicate accented syllables responsible
for a regular rhythm perceived on the segment (a).

The fastest segment, (b), is an interpolated clause (as in the example "draper")
integrated into the quotation. The interpolated clause is a reformulation of the
quotation's content and is realised by juxtaposition: "les sympathisants
socialistes pas des militants". This reformulation makes clear who the quotation
is talking about. Superimposed on the syntactic parallelism of the reformulation
(repetition and juxtaposition) are the voice and prosody, which indicate that the
speaker is agitated about the result of the opinion poll. The syntactic
parallelism accompanied by the vocal comment creates what we call "prosodic
reformulation".¹⁶ It adds a further meaning to the utterance. In other words,
reformulation is enhanced by prosody.
Example (7) illustrates a switch from the introductory part of the reported
speech to the quoted direct speech. The speech rate is fast on the introductory
part [7.6 syll/sec] and slows down on the reported direct speech [4.5 syll/sec].
The quotation refers to a legal text about the reduction of working time from 40
to 35 hours per week in France, which sparked an animated discussion in the media
in 2004.

16. The concept of prosodic reformulation was developed in Pršir (2010).


(7) "notre texte" ('our text')

    a.  et le Figaro de citer euh Patrick Ollier
        qui est président UMP de la commission des
        affaires économiques de l'Assemblée qui dit
        [7.6 syll/sec]
        'and the Figaro cite Patrick Ollier the president
        of UMP of the commission of Economic Affairs
        of the Assembly who says'
    b.  mais notre texte
        notre texte n'aura aucun
        aucun caractère
        contraignant [4.5 syll/sec]
        'but our text our text will not have any
        restrictive character'

[Prosogram v2.4e output; recording "notretxt", 0–10.80 s]

Figure 6. Prosogram of Example (7). Rectangles indicate the vowel lengthening and dynamic
tones on "aucun".

Speech rate variation is the most salient prosodic phenomenon. Nevertheless,
fundamental frequency and intensity participate in the contextualisation of the
quotation. The speech rate on segment (a) is particularly rapid [7.6 syll/sec], the
mean F0 value is 90 Hz and the F0 register ranges from 60 to 224 Hz; the mean
intensity is 69 dB. Segment (b) is 40% slower [4.5 syll/sec], with a much higher
F0 mean (207 Hz) and an expansion of the F0 register (66–368 Hz); the intensity
increases slightly (5 dB).
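
The expansion of the F0 register between the two segments is easier to compare on a logarithmic scale than in hertz. The sketch below converts the register limits reported for Example (7) into a span in semitones, using the standard relation of 12 semitones per doubling of frequency. This conversion is added here for illustration only and is not part of the analysis reported above; the function name `register_span_semitones` is hypothetical.

```python
import math


def register_span_semitones(f_min_hz, f_max_hz):
    """Width of an F0 register in semitones (12 semitones per octave)."""
    return 12.0 * math.log2(f_max_hz / f_min_hz)


# F0 register limits reported for Example (7), in Hz.
segment_a = (60, 224)   # introductory part
segment_b = (66, 368)   # quoted direct speech

span_a = register_span_semitones(*segment_a)
span_b = register_span_semitones(*segment_b)
print(f"segment (a): {span_a:.1f} semitones")
print(f"segment (b): {span_b:.1f} semitones")
```

With these limits, the register of the quoted speech is roughly seven semitones wider than that of the introductory part, which is consistent with the expansion described in the text.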
The repetition ("notre texte notre texte" and "aucun aucun") is associated with
insistence, with ample and dynamic tones on "aucun" [okœ̃] and extra lengthening
of the syllable [kœ̃] (447 ms and 550 ms). The vocal-prosodic posture, the lower
extended vowel [œ̃], refers iconically to largeness, meaning "big absence"
because of the semantic content of "aucun" ('not any'). Moreover, the utterance
starts with an argumentative "mais" ('but'), and the prosody of the repeated
segment "aucun aucun" supplements and completes the argumentation. The variation
in voice quality indicates that the speaker takes over the voice of the quoted
person, whether he imitates his physical features or not. For these reasons, the
vocal and prosodic representation of the quoted person can be a cue for irony.
As a matter of fact, the imitator often aims at parody and caricature, and this
reveals his viewpoint.

7. Conclusion

Rhythm, associated with other prosodic phenomena such as intonation, can
receive an iconic interpretation. Our analysis highlights the fact that the iconic
interpretation of a prosodic phenomenon can be realised at a local level (word or
phrase) or at a global level (utterance or sequence of utterances). There are,
however, cases where the local and global levels coexist and complement each
other (Example (1) "lecteurs"). We claim that the local repetition of melodic
contours, of initial and/or final accents, of vowel extension or even of pauses
contributes to a global perception of a discourse segment as rhythmically
organised.
We observed two discourse functions. The first is related to commentary: a
comment that iconically characterises the quoted person or that expresses the
position (stance) of the journalist. The involvement and concern of the speaker
can be measured by the prosodic parameters that create emphasis.
The second discourse function, inherent in most radio speech, relates to the
maximal expression, by the journalist, of the semantic content of words. This
means that the radio journalist takes every opportunity to represent a segment
iconically. In this way the discourse becomes more attractive for listeners
because of the analogy between sound form and content.


References

Auchlin, A. 2008. Du phonostyle à l'ethos, les prosodies comme interfaces entre sens et corps.
Paper presented at the 3rd International Symposium on Discourse Analysis: Emotions,
Ethos and Argumentation, April 1–4, in Belo Horizonte, Brazil.
Auchlin, A. & Simon, A.C. 2004. Gabarits prosodiques, empathie(s) et attitudes. Cahiers de
l'Institut de Linguistique de Louvain 30(1–3): 181–206.
Auer, P., Couper-Kuhlen, E. & Di Luzio, A. 1990. Isochrony and uncomfortable moments in
conversation. In Learning, Keeping and Using Language, Vol. 2, M.A.K. Halliday, J. Gibbons
& H. Nicholas (eds), 269–281. Amsterdam: John Benjamins.
Auer, P., Couper-Kuhlen, E. & Müller, F. 1999. Language in Time. The Rhythm and Tempo of
Spoken Interaction. Oxford: OUP.
Boersma, P. & Weenink, D. 2009. Praat. Doing Phonetics by Computer (Version 5.1.20).
http://www.praat.org/ (31 October 2009).
Bolinger, D. 1965. The atomization of meaning. Language 41: 555–573.
Couper-Kuhlen, E. 1993. English Speech Rhythm: Form and Function in Everyday Verbal
Interaction [Pragmatics & Beyond New Series 25]. Amsterdam: John Benjamins.
Couper-Kuhlen, E. 2003. Intonation and discourse: Current views from within. In The
Handbook of Discourse Analysis, D. Schiffrin, D. Tannen & H. Ehrenberger Hamilton (eds),
13–34. Oxford: Blackwell.
Crystal, D. 1987. Sound symbolism. In The Cambridge Encyclopedia of Language, 174–175.
Cambridge: CUP.
Fischer, A. 1999. What, if anything, is phonological iconicity? In Form Miming Meaning
[Iconicity in Language and Literature 1], M. Nänny & O. Fischer (eds), 123–134. Amsterdam:
John Benjamins.
Fónagy, I. 1980. L'accent français: Accent probabilitaire. In L'accent en français contemporain,
I. Fónagy & P.R. Léon (eds). Studia Phonetica 15: 123–233.
Fónagy, I. 1983. La vive voix. Essais de psycho-phonétique. Paris: Payot.
Fónagy, I. 1999. Why iconicity? In Form Miming Meaning [Iconicity in Language and
Literature 1], M. Nänny & O. Fischer (eds), 3–36. Amsterdam: John Benjamins.
Gachet, F. & Avanzi, M. 2008. La prosodie des parenthèses en français spontané. Verbum 31(1):
53–84.
Gumperz, J.J. 1992. Contextualization and understanding. In Rethinking Context: Language as
an Interactive Phenomenon, A. Duranti & C. Goodwin (eds), 229–252. Cambridge: CUP.
Gumperz, J.J. 2001. Inference. In Key Terms in Language and Culture, A. Duranti (ed.), 126–128.
Oxford: Blackwell.
Gussenhoven, C. 2002. Intonation and interpretation: Phonetics and phonology. In Speech
Prosody 2002: Proceedings of the First International Conference on Speech Prosody,
47–57. Aix-en-Provence: ProSig and Université de Provence, Laboratoire Parole et
Langage.
Hinton, L., Nichols, J. & Ohala, J.J. (eds). 1994. Sound Symbolism. Cambridge: CUP.
Mertens, P. 2004. The prosogram: Semi-automatic transcription of prosody based on a tonal
perception model. In Proceedings of Speech Prosody 2004, B. Bel & I. Marlien (eds).
http://bach.arts.kuleuven.be/pmertens/prosogram/


Ohala, J.J. 1984. An ethological perspective on common cross-language utilization of F0 of
voice. Phonetica 41: 1–16.
Pršir, T. 2010. L'apport de la prosodie à la reformulation et à la répétition lors du passage de
l'écrit à l'oral. In Actes du XXVe Congrès International de Linguistique et de Philologie
Romanes, Tome IV, M. Iliescu, H.M. Siller-Runggaldier & P. Danler (eds), 527–534. Berlin:
Walter de Gruyter.
Pršir, T. (to appear). Oral/écrit dans l'émergence de la mémoire auditive partagée. In Text
Structuring. Across the Line of Speech and Writing Variation [Corpora and Language in Use
Series no 2], C. Bolly & L. Degand (eds). Louvain-la-Neuve: Presses Universitaires de Louvain.
Simon, A.C. & Grobet, A. 2005. Interprétation des scansions rythmiques en français. In Actes du
colloque Interface Discours Prosodie, Aix-en-Provence.
Terken, J. 1991. Fundamental frequency and perceived prominence. Journal of the Acoustical
Society of America 89: 1768–1776.
