
The Emergence of Phonology
How well have classic ideas on whole-word phonology stood the test of time?
Waterson claimed that each child has a system of their own; Ferguson and
Farwell emphasized the relative accuracy of first words; Menn noted the
occurrence of regression and the emergence of phonological systematicity.
This volume brings together classic texts such as these with current data-rich
studies of British and American English, Arabic, Brazilian Portuguese,
Finnish, French, Japanese, Polish, and Spanish. This combination of classic
and contemporary work from the last thirty years presents the reader with
cutting-edge perspectives on child language by linking historical approaches
with current ideas such as exemplar theory and usage-based phonology and
contrasting state-of-the-art perspectives from developmental psychology and
linguistics. This is a valuable resource for cognitive scientists,
developmentalists, linguists, psychologists, speech scientists, and therapists
interested in understanding how children begin to use language without the
benefit of language-specific innate knowledge.
Marilyn M. Vihman is Professor of Language and Linguistic Science at the
University of York.
Tamar Keren-Portnoy is Lecturer in Language and Linguistic Science at
the University of York.
The Emergence of Phonology:
Whole-word Approaches and
Cross-linguistic Evidence
Edited by
Marilyn M. Vihman and Tamar Keren-Portnoy
University Printing House, Cambridge CB2 8BS, United Kingdom
Published in the United States of America by Cambridge University Press, New York
Cambridge University Press is part of the University of Cambridge.
It furthers the University's mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9780521762342
© Cambridge University Press 2013
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2013
Printed in the United Kingdom by CPI Group Ltd, Croydon CR0 4YY
A catalog record for this publication is available from the British Library.
Library of Congress Cataloging in Publication Data
The emergence of phonology : whole-word approaches and cross-linguistic
evidence / Edited by Marilyn M. Vihman and Tamar Keren-Portnoy
pages cm
Includes bibliographical references and index.
ISBN 978-0-521-76234-2
1. Lexical phonology. 2. Grammar, Comparative and general--
Phonology. 3. Grammar, Comparative and general--Morphology.
4. Reading--Language experience approach. 5. Language and languages--
Study and teaching. 6. Visual learning. 7. Complexity
(Linguistics) I. Vihman, Marilyn May, editor of compilation.
P217.6.E43 2013
414--dc23 2013013107
ISBN 978-0-521-76234-2 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication,
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
To the memory of Fergie
Charles A. Ferguson (1921–1998)
who mentored a generation of child phonologists,
including many whose work is included here
Contents
List of contributors
Preface
1 Introduction: the emergence of phonology: whole-word
approaches, cross-linguistic evidence
Marilyn M. Vihman and Tamar Keren-Portnoy
Part I The current framework
2 Phonological development: toward a "radical" templatic
phonology
Marilyn M. Vihman and William Croft
Part II Setting papers
3 Child phonology: a prosodic view
Natalie Waterson
4 Words and sounds in early language acquisition
Charles A. Ferguson and Carol B. Farwell
5 Developmental reorganization of phonology: a hierarchy of basic
units of acquisition
Marlys A. Macken
6 Development of articulatory, phonetic, and phonological
capabilities
Lise Menn
Part III Cross-linguistic studies
7 One idiosyncratic strategy in the acquisition of phonology
T. M. S. Priestly
8 Phonological reorganization: a case study
Marilyn M. Vihman and Shelley L. Velleman
9 How abstract is child phonology? Towards an integration of
linguistic and psychological approaches
Marilyn M. Vihman, Shelley L. Velleman,
and Lorraine McCune
10 Beyond early words: word template development in Brazilian
Portuguese
Daniela Oliveira-Guimarães
11 Templates in French
Sophie Wauquier and Naomi Yamaguchi
12 The acquisition of consonant clusters in Polish: a case study
Marta Szreder
13 Geminate template: a model for first Finnish words
Tuula Savinainen-Makkonen
14 Influence of geminate structure on early Arabic templatic patterns
Ghada Khattab and Jalal Al-Tamimi
15 Lexical frequency effects on phonological development: the case
of word production in Japanese
Mitsuhiko Ota
Part IV Perspectives and challenges
16 A view from developmental psychology
Lorraine McCune
17 Challenges to theories, charges to a model: the Linked-Attractor
model of phonological development
Lise Menn, Ellen Schmidt, and Brent Nicholas
References for reprinted papers
Index
Contributors
Jalal Al-Tamimi
William Croft
Carol B. Farwell
Charles A. Ferguson
Tamar Keren-Portnoy
Ghada Khattab
Marlys A. Macken
Lorraine McCune
Lise Menn
Brent Nicholas
Daniela Oliveira-Guimarães
Mitsuhiko Ota
T. M. S. Priestly
Tuula Savinainen-Makkonen
Ellen Schmidt
Marta Szreder
Shelley L. Velleman
Marilyn M. Vihman
Natalie Waterson
Sophie Wauquier
Naomi Yamaguchi
Preface
The idea of creating this volume of readings was first conceived five years ago.
We would like to thank our editor, Helen Barton, for her willingness to take
the project on and her patience in seeing it through what turned out to be a far
longer incubation period than any of us anticipated! We also thank the living
authors of reprinted papers for their permission to include their inspiring work
in this volume, and also the University of Wisconsin for providing financial
support for our reprint of Macken (1979). Finally, we thank Dr. Nina Gram
Garmann, University of Oslo and Akershus University College of Applied
Sciences, for providing insightful critiques of two of the papers published
here for the first time.
Every effort has been made to secure necessary permissions to reproduce
copyright material in this work, though in some cases it has proved impossible
to trace or contact copyright holders. If any omissions are brought to our notice,
we will be happy to include appropriate acknowledgements on reprinting, and/
or in any subsequent edition.
1 Introduction: the emergence of phonology:
whole-word approaches, cross-linguistic evidence
Marilyn M. Vihman and Tamar Keren-Portnoy
Whole-word phonology is a particular approach to early phonological
development. This volume is designed to bring together the classic papers
which gave rise to it in the 1970s and current studies that build on and extend
the model, which in essence took an emergentist and usage-based stance before
its time; the book will make no attempt to cover other approaches to
phonological development in any systematic way. Many of the papers, including
Vihman and Croft (2007, this volume, Chapter 2), with which we begin, use the
term 'template' to refer to child-specific word patterns identifiable within
the first year of word use. Templates, referred to sporadically in the earlier
developmental literature (e.g., Menn 1983, this volume, Chapter 6) and given
formal status for adult linguistic analyses in Prosodic Morphology (McCarthy
and Prince 1995), are a more focused expression of the ideas formulated by
Waterson (1971, this volume, Chapter 3), Ferguson and Farwell (1975, this
volume, Chapter 4), and Macken (1979, this volume, Chapter 5), which provided
the core of the 'whole-word phonology' idea (see Vihman and Croft 2007, this
volume, Chapter 2, for a summary of the basic arguments).
This volume is restricted to the study of early word production and the
phonological patterning that can be seen in that domain. The year in which the
first of our 'setting papers' was written (Waterson 1971, this volume,
Chapter 3) also marks the year of publication of the first study of infant
speech perception (Eimas, Siqueland, Jusczyk, and Vigorito 1971). Since then,
perception studies have solidly documented infants' remarkable early
discriminatory capacities and the rapid advances in knowledge of the ambient
language that follow over the first year of life (see Jusczyk 1997; Kuhl 2004;
and Vihman forthcoming 2014 for reviews), while numerous studies demonstrating
infant statistical learning (in language and other areas) from an early age
have expanded our understanding of the learning mechanism that may underlie
those advances (see Thiessen and Saffran 2007, and Johnson and Tyler 2010 for
alternative positions on the role of statistical learning; Vihman forthcoming
2014: ch. 5 provides an overview). In addition, several distinct methodological
procedures have been used to trace and explore the nature of early word-form
learning over the first two years of life. The resultant studies are of evident
relevance to phonological development, but none are included here, as the
addition of even a few would result in a far longer and less focused volume
(and the studies are readily available elsewhere). Nevertheless, what we have
learned about perception and early learning capacities is critical to our
understanding of the course of phonological development and clearly
complements the whole-word approach presented here.
Whole-word phonology and templates in phonological
development
As understood in the chapters that follow, templates involve (idiosyncratic)
prosodic structures that appear to be generalized, in different ways by different
children, from the forms of a child's earlier babble vocalizations and first words.
Templates typically lead to increased similarity in the forms of the child's words at
the expense of accuracy (i.e., of match to the adult target form). This corresponds
to a sequence of, first, item (or exemplar) learning, then distributional learning,
implicitly and automatically applied to repeatedly used child output forms, the
presumed source of the generalizing of patterns to new targets. Taking an
exemplar model perspective, this generalization can be thought of as the
self-organization of the exemplar space, due to connections being formed between
similarly shaped child forms; an alternative (but not necessarily incompatible),
strictly sensorimotor perspective sees the generalization as no more than
the automatization of one or more well-practiced procedures, namely, the child's
emergent neuromotor word-production routines (McCune, this volume,
Chapter 16). The resultant patterns appear to constitute (unconscious or implicit)
child responses to the phonological challenges posed by target word forms. In
other words, the child's existing resources (familiar production routines) are
deployed to deal with what is novel and thus difficult to bring to mind, plan,
and produce as needed. Although this understanding of the function of templates
and of the mechanism underlying 'generalization and analogy' (Macken 1979,
this volume, Chapter 5, p. 144) is relatively recent, the core papers depict
essentially the same learning sequences and the same conclusion as to the role
of templatic patterns as a way of dealing with challenges by bringing familiar
routines to bear on them.
Vihman and Velleman (2000) introduced the terminological distinction
between 'selected' words, or child word forms that (roughly) match the form
of their adult targets while conforming to a child's preferred prosodic structure,
and 'adapted' words, or word forms based on adult targets that are less similar
to the child's pattern, which the child thus modifies more radically to arrive at
an output that fits the template. Examples of both selected and adapted words
can be found in many of the chapters of this book (see also Keren-Portnoy,
Majorano, and Vihman 2009). The earliest papers make no mention of the
term 'template', let alone of 'selecting' and 'adapting', yet the detailed data
presented by Priestly (1977, this volume, Chapter 7), for example, make it easy
to see that some words, such as lion, produced as [lajn], and whale [wjl]
('bisyllabic ordinary forms' in Priestly's terms), are selected in our sense,
while others (the bisyllabic 'experimental forms', which Priestly found to be
'not only amusing but systematic', p. 217) are adapted: e.g., berries [bjas],
chocolate [kajak], peanut [pijat], and tiger [tajak].
From the child's point of view, there is presumably no essential difference,
except perhaps of degree, between the two kinds of words: things that are
similar are treated similarly. The targets for selected words are similar to
other selected word targets as well as to the child's own forms of those words.
The targets for adapted words are not as obviously similar to one another, yet
they must sufficiently resemble other words rendered within the framework of
that particular template to attract the child into associating them with the same
type of own (child) form. It is typical of the forms used under the influence
of a child's dominant template that no attempt (by researchers) to separately
trace or relate each segment to its presumed model in the target word will yield
a satisfactory analysis (this is well exemplified by the data in both Waterson
1971 and Macken 1979, this volume, Chapters 3 and 5 respectively, as both
investigators emphasize). Instead, we see the child matching the overall shape
of the adult word (CVC[C]V[C] in Priestly's examples), often including the
target syllable count, as here, and at least one of the consonants, while
simplifying the overall structure through repetition of segments or syllables or
through reordering to achieve a fixed output structure for multiple lexical items
(here, CVjVC). In short, the term 'template' is used to formalize the notion
of whole-word learning as the basis of a child's phonology.
It is important to note that templates are not a lasting element in a child's
phonological system, even for children learning the classic templatic adult
language, Arabic (see Khattab and Al-Tamimi, this volume, Chapter 14).
Instead, templates typically gain increasing dominance over a period of days,
weeks or months (often beginning toward the middle or end of the single-word
period) but then fade thereafter, as the child comes to master (in terms of
articulation, speech planning, and memory or representation) the more complex
sequences of the adult language: see Priestly's and Oliveira-Guimarães's
accounts of the rise and fall of templates in the phonological development of
one English and two Brazilian children respectively (this volume, Chapters 7
and 10), as well as Macken (1979, this volume, Chapter 5), for the emergence
of templates and the subsequent advance to accurate segmental sequences in
the speech of a Spanish-learning child, and Vihman and Vihman (2011), a
longitudinal account of the emergence, use, and fading of two templatic patterns
in a diary study of an Estonian- and English-learning child's first 500 words.
Finally, note that the templatic shape itself is dynamic, changing in more or
less subtle ways over the period of time in which it holds sway as the child's
phonological knowledge increases and stabilizes, often with a period of
competition between variant solutions to the phonological challenge (see
Priestly 1977, this volume, Chapter 7; Macken 1979, this volume, Chapter 5;
Vihman and Velleman 1989, this volume, Chapter 8; Vihman, Velleman, and
McCune 1994, this volume, Chapter 9; and Oliveira-Guimarães, this volume,
Chapter 10, as well as Menn and Matthei 1992, who discuss competition in
child rules or patterns).
Universals vs. typological and individual differences: the role
of rhythm
How does the child get started learning the phonetics and the phonology of the
ambient language? What resources are available for kick-starting the process?
It is worth considering the role of rhythm, in both perception and production,
as a theoretical and developmental starting point for the child, and one which
may go some way toward accounting for three separate aspects of child vocal
production: its initial universality, its typological variability by language of
exposure, and the individual differences found even within a single language
group, all of which are amply illustrated in the chapters of this book.
The earliest theoretical statements about the course of phonological
development, those of Jakobson (1941/1968), were based on diary studies,
with their inevitable focus on the individual child and his or her early word
production. Nevertheless, the conclusions of that highly influential first attempt
at systematization, heavily shaped by the structuralist theoretical principles of
the Prague School of linguistics of which Jakobson was a key member, were
meant to serve as putative universals. Somewhat later, Brown (1958)
provocatively hypothesized that babbling (thought by Jakobson to be unrelated
to later phonological development) involved a phonetic drift in the direction of
the ambient language. It was only later still, when the wide availability of
first audio and then video recording devices made possible far more reliable and
detailed phonetic observations of children's speech and especially of their
prelinguistic vocalizations, that the wide range of individual differences in
pathways to language (even for children acquiring the same language) began
to become evident from production studies (see, e.g., Vihman, Macken, Miller,
Simmons, and Miller 1985; Vihman, Ferguson, and Elbert 1986; Menn and
Vihman 2011). All three of these characteristics of phonological development
must be encompassed in our understanding of this complex process: universals,
or the commonalities to be found in the babble and first word production of
children learning any language; ambient language effects and their implications
for the mapping of what is perceived onto vocal production; and the variability
due to the contribution of the individual child, within the constraints of
perception, the neurophysiology of vocal production, and cognitive development.
Perceptual experience of the dominant rhythms of the ambient language can
be taken to provide a phonological frame suitable for supporting first word
forms (see Wauquier and Yamaguchi, this volume, Chapter 11, for evidence
of the impact of rhythm on template formation in French). In other words,
perceptual experience of the specific rhythms of the language will yield a typical
one- or two-syllable unit, based on stress and syllable type and weight, which
a child's immature and inexperienced phonological memory will retain and
use, first in implicit segmentation (Nazzi, Iakimova, Bertoncini, Frédonie, and
Alcantara 2006; Höhle, Bijeljac-Babic, Herold, Weissenborn, and Nazzi 2009;
Pons and Bosch 2010), then in early attempts at production; this in turn will
tend to strengthen the patterns that the child has tuned into, resulting in more
ambitious targeting of adult words (i.e., of word targets beyond the child's
production abilities), which are thus adapted to a well-practiced pattern or
template. (For evidence that phonological memory is constructed through
use, see Keren-Portnoy et al. 2010.) The cross-linguistic data provided in this
volume are largely consistent with this proposal, as child templates are shaped
by target language affordances whose scope is typically a lexical unit (a word or
a short phrase) in interaction with the child's own babbling practice and first
word production experience (through 'selecting' and 'adapting').
The evidence from templates suggests that rhythm is critical here, providing a
perceptual envelope into which the child's individual production patterns can
be fitted. As Brown anticipated, individual children's vocal practice (babble)
gradually drifts toward (or is shaped by) the rhythms of input speech
(Boysson-Bardies, Hallé, Sagart, and Durand 1989; Boysson-Bardies and
Vihman 1991); this implicit sensorimotor experience of babbling is a critical
mechanism for transforming heard speech patterns into the production base for
word learning, a different base for different children, despite broadly similar
input and neuromotor constraints. It is this prosodic framing of speech sequences
that eventually leads to the individual but ambient-language-influenced
phonological templates.
In contrast to the implicit shaping of babble by perceived input speech, the
integration of what is heard with what can be produced as learned word forms
is neither automatic nor effortless. Furthermore, because this integration will
depend on such individually variable factors as the particular characteristics of a
child's babble, emergent representational ability, and volubility or sociability,
among other things, we should not be surprised at the wide variability identified
in production even among children learning the same language. In general, the
patterns that we find described in this volume, for one or more children per
language, broadly reflect the prosody of the individual language and support
the notion that rhythm is an important starting point for phonology (for further
discussion, see Wauquier and Yamaguchi, this volume, Chapter 11).
Whole-word learning from the perspective
of an exemplar model
In what sense is the phenomenon that we have been describing 'whole-word
learning'? The child's rendition of the word shows sensitivity to some of the
segments or phonetic features that occur in it. However, it does not necessarily
maintain the order in which the segments or features appear, but may instead
redistribute, merge or spread some of those features. This is seen as whole-word
learning because it is within the lexical unit (a word or a short but often repeated
phrase: all gone, in there, what's this?) that this lack of conformity to the
identity or ordering of parts is observed. Within the lexical unit there may be
no clear evidence that the child has registered information about the identity
and number of all of the segments (as perceived by adult speakers) or their
relative order. Based on evidence from production, then, children seem to have a
representation or memory trace of the adult form, but that representation is not
constructed out of an ordered sequence of segments.
This claim, that the child's representation lacks a clear structure made up of
neatly ordered parts, has often been misunderstood. The failure to appreciate
what is meant here has led some researchers to ascribe to the proponents of
whole-word phonology, or holistic representations, the claim that such
representations are vague or underspecified (Gerken, Murphy, and Aslin 1995;
Swingley and Aslin 2002, 2007; Storkel and Maekawa 2005); holistic
representations are contrasted here with segmentally detailed representations
(Storkel and Maekawa 2005) that are characterized by phonetic specification
(Swingley and Aslin 2002). Gerken et al. (1995) aptly present this viewpoint:
'Children represent early words in terms of holistic properties, such as prosodic
structure and acoustic shape, or in terms of phonetic features that are not
bundled into individual segments' (p. 476). In fact, as we understand them,
these child representations include abundant detail, much more than is
apparent in phonemic or even broad phonetic description.
Taking an exemplar model perspective on whole-word learning, let us
consider what whole-word learning might be like. As suggested by Pierrehumbert
(2003), the perceptual input for speech is an auditory coding of the speech
signal. A covering map provides an analog representation of the phonetic space,
with the dimensions being the many phonetic parameters which are relevant to
speech perception (p. 132; see also Edwards, Munson, and Beckman 2011).
Thus, for infants the representation is highly detailed, perhaps hyper-detailed,
or even overly detailed in some aspects but less so in others. In addition,
since infants need not at first know which acoustic parameters are relevant for
speech perception, they may assign weights to parameters differently than
adults would.
Something Pierrehumbert does not mention, but which may also affect
information processing in the young child, is salience: parts of the acoustic signal
which are less readily perceived (shorter, lower pitch, quieter; typically,
unstressed) may be processed less successfully, with more error or more loss
of information (as shown in Vihman, Nakai, DePaolis, and Hallé 2004). Since
unstressed parts of words also tend to be produced with increased motoric
variability (Goffman, Gerken, and Lucchesi 2007), the unstressed parts of
different exemplars of the same word would differ more, leading those parts to be
less coherently represented; that is, their representations would contain more
variability or noise. In a noisy or variable exemplar space the treatment of a
newly encountered exemplar as belonging, or not, to the particular category will
be less consistent.
Note that our claim is that young children's exemplar space is sparser and
more variable than adults', with less clearly defined clumps or categories, and
that it therefore functions with less clearly defined boundaries for what does or
does not fall within each category. The more frequently a child encounters or
produces the exemplars of a given lexical type or structure, the sharper will be
the organization of the corresponding portion of exemplar space (see Ota, this
volume, Chapter 15).
There is no vagueness or lack of detail in this scenario. What is lacking
is segmental organization, or a tidy organization into sequentially ordered
time-bound units, each built of a unique co-occurring set of features. It is this
abstract level of categorization that is missing, not fine detail. In this sense
the child's representations, based on the evidence of phonological templates,
are both richer and poorer than what is implied by standard phonetic
transcription: they are rich in featural texture but poor in sequential
organization. Nor does the interpretation of child phonological representations
as lacking segmental units constitute a problem for continuity between child and
adult phonological knowledge, to the extent that some theoretical models similarly
deny any such organization for adult representations (Browman and Goldstein
1989, 1991, 1992; Pierrehumbert 2003; Edwards, Beckman, and Munson
2004; Edwards, Munson, and Beckman 2011; Munson, Edwards, and
Beckman 2012).
The orientation of this volume
In the 1970s three papers appeared that have since become classics: Waterson
(1971, this volume, Chapter 3) took a Firthian approach to one child's
phonology and introduced the notion of 'schemas', or child-specific word patterns;
Ferguson and Farwell (1975, this volume, Chapter 4) argued for whole-word
or lexical patterns as the core of adult as well as child phonological knowledge;
and Macken (1979, this volume, Chapter 5) demonstrated the unusual
adult-to-child-form mappings that can be found in early phonology to meet the
child's constraints on output forms. These papers all stood outside of phonological
theory as it was understood at the time, shortly after publication of the definitive
statement of generative phonology, Chomsky and Halle (1968). As it happened,
that formalization was about to be superseded by the range of new perspectives
that emerged in response to the perceived limitations of Chomsky and Halle's
approach (see Van der Hulst and Smith 1982; Anderson 1985; Goldsmith 1995;
and Scheer 2013). This period in the study of phonological development
culminated in the widely cited paper by Menn (1983, this volume, Chapter 6),
who adopted a psycholinguistic perspective and formulated the 'two-lexicon'
model (for a rethinking of this model, see Menn and Matthei 1992; Menn,
Schmidt, and Nicholas 2009, as well as this volume, Chapter 17).
In the period that followed, phonological theory blossomed and expanded,
diversifying into a range of distinct theories, including CV phonology (Clements
and Keyser 1983), Lexical Phonology (Mohanan 1986), Autosegmental and
Metrical Phonology (Goldsmith 1990), Dependency Phonology (Durand
1990), Government Phonology (Kaye, Lowenstamm, and Vergnaud 1990),
Declarative Phonology (Coleman 1998), and most recently CVCV phonology
(Scheer 2004). However, during the 1990s one of the new models, Optimality
Theory (OT: Prince and Smolensky 1992/2004), began to dominate the field, to
the point that it came often to be the only theoretical perspective presented to
linguistics students. A number of attempts have been made to cast phonological
development in terms of OT (see Boersma and Levelt 2003); the studies collected
in Kager, Pater, and Zonneveld (2004) are dedicated to the presentation of
acquisition data from an OT perspective. Yet no extensive OT treatment of data
from one or more children has appeared to date. The present volume returns to the
whole-word phonology approach, which has much in common with the early
work of McCarthy and Prince (1986, 1993, 1995) but which, on the basis of
extensive cross-linguistic studies of child data, diverges sharply from OT, with
its reliance on Universal Grammar and markedness theory and the tendency of
its advocates to expect linear advances along with set stages of development and
across-the-board changes in child forms.
As Menn and her colleagues point out in their recent efforts to model what we
know about how children learn phonology (Menn et al. 2009; see also Menn
et al., this volume, Chapter 17), any adequate theory of phonological
development must be able to account for three key findings, all solidly
grounded in forty years of empirical research:
1. individual differences across children,
2. lexical variation within a given child,
3. the phenomenon of regression (nonlinear advance, or the U-shaped curve),
in which early accuracy is succeeded by less accurate, more child-specific
word forms only to be followed, much later, by a return to adultlike forms,
or relative accuracy.
No theoretical approach that sees phonological development as the automatic
suppression of innate processes (Stampe 1969), across-the-board changes in rule
application (Smith 1973), triggering of parameters (Fikkert 1994) or reordering
of constraints (various chapters in Kager et al. 2004) can account for these core
characteristics in any straightforward way.
Vihman and Croft (2007, this volume, Chapter 2) propose a new way of
thinking about phonology, based on the ideas of Charles Ferguson and the
evidence from child data, with specific reference to the three characteristics
listed above. Menn and her colleagues (this volume, Chapter 17), who focused
on OT in their 2009 critique of what has been missing in theories of
phonological acquisition to date, now propose to extend Vihman and Croft's
exemplar model by including some key missing elements, namely, (i) a role
for representation of the adult target form; (ii) mappings from input to output
(corresponding to the influential 'rules' or 'processes' of earlier generative
models such as Stampe 1969 and Smith 1973); and (iii) mappings from output
to input, this latter an important characteristic not yet incorporated in any purely
phonological theory.
The concluding section of this volume includes both Menn et al.'s future-
oriented proposals for extension and revision of Vihman and Croft's model
and McCune's developmentally oriented thoughts on the importance of the
study of phonological development as a whole, and of early word templates
in particular, for child language development. These discussion chapters
clash on a number of specific points, which can only be a healthy symptom
of the liveliness of creative thinking in our field, even within the scope of
a broadly similar theoretical (here, functionalist) inclination. Beneath the
evident differences (in defining the notion of representation, in particular)
the editors of this volume find some deeply rooted similarities. Specifically,
the interrelatedness of representations in a network of potential associations,
both formal and meaning-related, within a broadly neurological framework
seems to us to emerge from both of these chapters. Based on this idea we can
conceptualize the potentiality or transitory nature of representations, such
that they only come into existence in moments of what, in adults, could be
termed "consciousness" (a better expression, for developmental purposes,
might be "moments of use", for either speaker or listener). To return to the
exemplar metaphor, individual instances (or their subparts) are activated to
differing degrees in different situations of use, when activation lights up
differing elements in the network of associations. This would mean that
the representation of a given item has no essential stability over time,
especially in the early period of phonological development. It is our hope
that the contrasting perspectives provided will lead to discussion, debate, and
further empirical research.
The contents of this volume
Three basic considerations guided the choice of papers for this volume. First,
in Part II ("Setting papers") we sought to provide a fair representation of the
core papers that gave rise to the whole-word approach. Secondly, in Part III
("Cross-linguistic studies") we included empirical papers that work through the
implications of this approach. We limited ourselves to data-oriented papers in
which the child's word forms are presented in sufficient numbers to give the
reader a clear understanding of the shape of his or her emerging phonology,
excluding papers with only anecdotal mention of specific forms to support a
rule or constraint. Most of the chapters in this volume also take an overtly
whole-word approach, but that is not the case with all of them (neither Ota,
Chapter 15, nor Priestly, Chapter 7, has any such explicit orientation, for example).
Thirdly, this volume is specifically designed to provide data and analysis
exemplifying the ways in which these fundamental properties of phonological
development are manifested in a range of different languages and child learners.
The diversity of languages included makes it possible to document, for
example, the exceptional salience of geminates for children acquiring languages
that have them, here Finnish (Savinainen-Makkonen, Chapter 13) and
Arabic (Khattab and Al-Tamimi, Chapter 14), or the unavoidability of
learning clusters early in a language in which they are particularly common
(and one child's solution to that problem: see Szreder, Chapter 12). (Of the
chapters in Part II, only Macken's, Chapter 5, concerns a language other than
English.) This will enable a serious student of child phonology to see just what
kinds of data need to be accommodated, not in one child or one language alone
but in a broad sample, and also to see the lines of similarity, patterns, and
limitations or constraints that recur in one child and one language after another,
although not always in the form predicted by adult-theory-based notions of
markedness (see also Vihman and Kunnari 2008). However, the typological
study of child phonological development is certainly still in its infancy. We hope
that this volume will stimulate empirical studies of children learning a far wider
range of languages.
We believe that "whole-word phonology" (in Ferguson and Farwell's
terms), or "Templatic Phonology" (Vihman and Croft 2007), provides a
model that, while still limited in many ways, is at a minimum faithful to the
evidence afforded by large quantities of child data. This is one of the key
legacies of Charles Ferguson's approach: the empirical data are allowed to
speak, although the interpretation will necessarily be influenced by the
investigator's training and habits of mind. We hope that this edited volume,
with its mix of classics and both old and new data-based papers as well as
contemporary re-evaluations from the points of view of both linguistics and
developmental psychology, will bring the whole-word phonology approach to
the attention of a new generation of linguists, psychologists, psycholinguists,
and speech scientists.
Notes
1. Papers included in this volume are indicated in bold face; those published here for
the first time are cited without mention of year of publication.
2. Some of these latter studies are specifically designed to challenge the notion of
whole-word phonology; we discuss below some of the ways in which this idea
has been interpreted (or misinterpreted). Although we cannot here enter into a
discussion of the differences between experimental responses to a limited number
of stimuli and spontaneous speech production, Vihman, DePaolis, and Keren-Portnoy
(2009) discuss the issue briefly, while Vihman (forthcoming 2014: ch. 7) is devoted to
"Experimental studies of word form learning".
3. Smith (1973), still the most extensive analysis of any one child's phonology, came out
in the same period. Smith's study held closely to the mainstream formalization
represented by Chomsky and Halle and purported to demonstrate, in direct contradiction
to the studies included in this volume, that there is no basis for assuming that
the child's word forms reflect an independent system.
4. Some of these models overlap with or subsume others.
References
Anderson, S. R. (1985). Phonology in the twentieth century: theories of rules and
theories of representations. Chicago University Press.
Boersma, P. and Levelt, C. (2003). Optimality theory and phonological acquisition.
Annual Review of Language Acquisition, 3, 1–50.
Boysson-Bardies, B. de, Hallé, P., Sagart, L., and Durand, C. (1989). A crosslinguistic
investigation of vowel formants in babbling. Journal of Child Language, 16, 1–17.
Boysson-Bardies, B. de and Vihman, M. M. (1991). Adaptation to language: evidence
from babbling and first words in four languages. Language, 67, 297–319.
Browman, C. P. and Goldstein, L. (1989). Articulatory gestures as phonological units.
Phonology, 6, 201–51.
(1991). Gestural structures: distinctiveness, phonological processes and historical
change. In I. G. Mattingly and M. Studdert-Kennedy (eds.), Modularity and the
motor theory of speech perception, pp. 313–38. Hillsdale, NJ: Lawrence Erlbaum.
(1992). Articulatory phonology: an overview. Phonetica, 49, 155–80.
Brown, R. (1958). Words and things. Glencoe, IL: Free Press.
Chomsky, N. and Halle, M. (1968). The sound pattern of English. New York: Harper &
Row.
Clements, G. N. and Keyser, S. J. (1983). CV Phonology: a generative theory of the
syllable. Cambridge, MA: MIT Press.
Coleman, J. S. (1998). Phonological representations: their names, forms and powers.
Cambridge University Press.
Durand, J. (1990). Generative and nonlinear phonology. London: Longman.
Edwards, J., Beckman, M. E., and Munson, B. (2004). The interaction between vocabulary
size and phonotactic probability effects on children's production accuracy and fluency
in nonword repetition. Journal of Speech, Language, and Hearing Research, 47,
421–36.
Edwards, J., Munson, B., and Beckman, M. E. (2011). Lexicon–phonology relationships
and dynamics of early language development: a commentary on Stoel-Gammon's
"Relationships between lexical and phonological development in young children".
Journal of Child Language, 38, 35–40.
Eimas, P. D., Siqueland, E. R., Jusczyk, P. W., and Vigorito, J. (1971). Speech perception
in infants. Science, 171, 303–6.
Ferguson, C. A. and Farwell, C. B. (1975). Words and sounds in early language acquisition.
Language, 51, 419–39. Reprinted in W. S-Y. Wang, The lexicon in phonological
change. The Hague: Mouton (1977). Reprinted in this volume as Chapter 4.
Fikkert, P. (1994). On the acquisition of prosodic structure. PhD dissertation, University
of Leiden (HIL Dissertations 6). The Hague: Holland Academic Graphics.
Gerken, L. A., Murphy, W. D., and Aslin, R. N. (1995). Three- and four-year-olds'
perceptual confusions for spoken words. Perception and Psychophysics, 57,
475–86.
Goffman, L., Gerken, L. A., and Lucchesi, J. (2007). Relations between segmental and
motor variability in prosodically complex nonword sequences. Journal of Speech,
Language, and Hearing Research, 50, 444–58.
Goldsmith, J. A. (1990). Autosegmental and metrical phonology. Oxford: Blackwell.
(ed.) (1995). The handbook of phonological theory. Oxford: Blackwell.
Höhle, B., Bijeljac-Babic, R., Herold, B., Weissenborn, J., and Nazzi, T. (2009).
Language-specific prosodic preferences during the first half year of life: evidence
from German and French infants. Infant Behavior and Development, 32, 262–74.
Jakobson, R. (1941/1968). Child language, aphasia, and phonological universals, trans.
A. R. Keiler. The Hague: Mouton. (Originally published as Kindersprache, Aphasie
und allgemeine Lautgesetze. Uppsala: Almqvist & Wiksell, 1941.)
Johnson, E. K. and Tyler, M. (2010). Testing the limits of statistical learning for word
segmentation. Developmental Science, 13, 339–45.
Jusczyk, P. W. (1997). The discovery of spoken language. Cambridge, MA: MIT Press.
Kager, R., Pater, J., and Zonneveld, W. (eds.) (2004). Constraints in phonological
acquisition. Cambridge University Press.
Kaye, J., Lowenstamm, J., and Vergnaud, J. (1990). Constituent structure and government
in phonology. Phonology, 7, 193–231.
Keren-Portnoy, T., Majorano, M., and Vihman, M. M. (2008). From phonetics to phonology:
the emergence of first words in Italian. Journal of Child Language, 36, 235–67.
Keren-Portnoy, T., Vihman, M. M., DePaolis, R., Whitaker, C., and Williams, N. M.
(2010). The role of vocal practice in constructing phonological working memory.
Journal of Speech, Language, and Hearing Research, 53, 1280–93.
Kuhl, P. K. (2004). Early language acquisition: cracking the speech code. Nature
Reviews Neuroscience, 5, 831–43.
Macken, M. A. (1979). Developmental reorganization of phonology: a hierarchy of basic
units of acquisition. Lingua, 49, 11–49. Reprinted in this volume as Chapter 5.
McCarthy, J. J. and Prince, A. S. (1986). Prosodic morphology. MS, University of
Massachusetts, Amherst, and Brandeis University.
(1993). Prosodic morphology, I: Constraint interaction and satisfaction. MS, University
of Massachusetts, Amherst, and Rutgers University, New Brunswick.
(1995). Prosodic morphology 1. In J. A. Goldsmith (ed.), The handbook of phonological
theory, pp. 318–66. Oxford: Blackwell.
Menn, L. (1983). Development of articulatory, phonetic and phonological capabilities. In
B. Butterworth (ed.), Language production, vol. 2, pp. 3–50. London: Academic
Press. Reprinted in this volume as Chapter 6.
Menn, L. and Vihman, M. M. (2011). Features in child phonology: inherent, emergent, or
artefacts of analysis? In N. Clements and R. Ridouane (eds.), Where do phonological
features come from? The nature and sources of phonological primitives,
pp. 261–301. Amsterdam: John Benjamins.
Menn, L. and Matthei, E. (1992). The two-lexicon account of child phonology. In
C. A. Ferguson, L. Menn, and C. Stoel-Gammon (eds.), Phonological development:
models, research, implications, pp. 211–47. Timonium, MD: York Press.
Menn, L., Schmidt, E., and Nicholas, B. (2009). Conspiracy and sabotage in the
acquisition of phonology: dense data undermine existing theories, provide scaffolding
for a new one. Language Sciences, 31, 285–304.
Mohanan, K. P. (1986). The theory of lexical phonology. Dordrecht: D. Reidel.
Munson, B., Edwards, J., and Beckman, M. E. (2012). Phonological representations in
language acquisition: Climbing the ladder of abstraction. In A. Cohn, C. Fougeron,
and M. Huffman (eds.), Oxford handbook of laboratory phonology, pp. 288–309.
Oxford University Press.
Nazzi, T., Iakimova, G., Bertoncini, J., Frédonie, S., and Alcantara, C. (2006). Early
segmentation of fluent speech by infants acquiring French. Journal of Memory and
Language, 54, 283–99.
Pierrehumbert, J. (2003). Phonetic diversity, statistical learning and acquisition of
phonology. Language and Speech, 46, 115–54.
Pons, F. and Bosch, L. (2010). Stress pattern preference in Spanish-learning infants: the
role of syllable weight. Infancy, 15, 223–45.
Priestly, T. M. S. (1977). One idiosyncratic strategy in the acquisition of phonology.
Journal of Child Language, 4, 45–66. Reprinted in this volume as Chapter 7.
Prince, A. and Smolensky, P. (2004). Constraint interaction in generative grammar. In
J. J. McCarthy (ed.), Optimality Theory in phonology: a reader, pp. 3–71. Oxford:
Blackwell. (Originally circulated as Optimality: constraint interaction in generative
grammar, MS, 1992.)
Savinainen-Makkonen, T. (2007). Geminate template: a model for first Finnish words.
First Language, 27, 347–59. Reprinted in this volume as Chapter 13.
Scheer, T. (2004). A lateral theory of phonology: what is CVCV and why should it be?
Berlin: Mouton de Gruyter.
(2013). Aspects of the development of generative phonology. In N. C. Kula, B. Botma,
and K. Nasukawa (eds.), The Bloomsbury companion to phonology, pp. 397–446.
London: Bloomsbury.
Smith, N. V. (1973). The acquisition of phonology: a case study. Cambridge University
Press.
Stampe, D. (1969). The acquisition of phonetic representation. Papers from the Fifth
Regional Meeting of the Chicago Linguistic Society, Chicago, IL. Reprinted in
D. Stampe, A dissertation on natural phonology. New York: Garland, 1979.
Storkel, H. L. and Maekawa, J. (2005). A comparison of homonym and novel word
learning: the role of phonotactic probability and word frequency. Journal of Child
Swingley, D. and Aslin, R. N. (2002). Lexical neighborhoods and the word-form representations
of 14-month-olds. Psychological Science, 13, 480–4.
(2007). Lexical competition in young children's word learning. Cognitive Psychology,
54, 99–132.
Thiessen, E. D. and Saffran, J. (2007). Learning to learn: infants' acquisition of stress-based
strategies for word segmentation. Language Learning & Development, 3, 73–100.
van der Hulst, H. and Smith, N. (1982). The structure of phonological representations.
Dordrecht: Foris.
Vihman, M. M. (Forthcoming 2014). Phonological development: the first two years. 2nd
edn. Oxford: Blackwell.
Vihman, M. M. and Croft, W. (2007). Phonological development: toward a radical
templatic phonology. Linguistics, 45, 683–725. Reprinted in this volume as
Chapter 2.
Vihman, M. M., DePaolis, R. A., and Keren-Portnoy, T. (2009). A Dynamic Systems
approach to babbling and words. In E. Bavin (ed.), Handbook of child language,
pp. 163–82. Cambridge University Press.
Vihman, M. M., Ferguson, C. A., and Elbert, M. (1986). Phonological development from
babbling to speech: common tendencies and individual differences. Applied
Psycholinguistics, 7, 3–40.
Vihman, M. M. and Kunnari, S. (2008). The sources of phonological knowledge: a cross-
linguistic perspective. Recherches Linguistiques de Vincennes, 35, 133–64.
Vihman, M. M., Macken, M. A., Miller, R., Simmons, H., and Miller, J. (1985). From
babbling to speech: a reassessment of the continuity issue. Language, 61,
395–443.
Vihman, M. M., Nakai, S., DePaolis, R. A., and Hallé, P. (2004). The role of accentual
pattern in early lexical representation. Journal of Memory and Language, 50,
336–53.
Vihman, M. M. and Velleman, S. (1989). Phonological reorganization: a case study.
Language and Speech, 32, 149–70. Reprinted in this volume as Chapter 8.
(2000). Phonetics and the origins of phonology. In N. Burton-Roberts, P. Carr, and
G. Docherty (eds.), Phonological knowledge: its nature and status, pp. 305–39.
Vihman, M. M., Velleman, S. L., and McCune, L. (1994). How abstract is child phonology?
Towards an integration of linguistic and psychological approaches. In M. Yavas
(ed.), First and second language phonology, pp. 9–44. San Diego: Singular
Publishing. Reprinted in this volume as Chapter 9.
Vihman, M. M. and Vihman, V-A. (2011). From first words to segments: a case study in
phonological development. In E. V. Clark and I. Arnon (eds.), Experience, variation
and generalization: learning a first language, pp. 109–33. Amsterdam: John
Benjamins.
Waterson, N. (1971). Child phonology: a prosodic view. Journal of Linguistics, 7,
179–211. Reprinted in this volume as Chapter 3.
Part I
The current framework
2 Phonological development: toward a radical
templatic phonology
Marilyn M. Vihman and William Croft
1. Introduction
In this chapter we argue for a template-based approach to segmental phonological
representation. Our central theoretical hypothesis is that the segmental phonological
structure of words is represented as language-specific phonotactic
templates (the latter including syllable structure and other higher-order
structures such as metrical structure).¹ We present cross-linguistic evidence
from phonological development that supports a template-based approach to
phonological representation. We argue, however, that the template-based
approach is equally suited to the analysis of adult phonology. Research in
more phonetically oriented approaches towards phonological categories, and in
usage-based or exemplar models of the representation of phonological knowl-
edge, also supports a template-based approach to the representation of the
phonological structure of words. We take this research (and ours) to its logical
conclusion, arguing that it applies to more abstract phonological categories and
adult phonologies as well. Before turning to this evidence, we briefly discuss three
general issues that have led us to this approach to phonological representation.
The rst issue is the relationship between language structure and language
function, namely, communication for the purposes of social interaction
(see Clark 1996; Keller 1994). The hypothesis that we propose, following
many others, is that the starting point for the analysis of linguistic structure
should be the sound–meaning link that defines linguistic signs or symbols. This
hypothesis does not rule out the possibility that generalizations about linguistic
structure, including phonological structure, may be separated from general-
izations about their function. Indeed, there is much arbitrariness in language,
most notably the arbitrariness of the association of a phonological form with a
particular meaning in a particular language. Also, as is well known, the phono-
logical organization of a word into syllables often fails to match the morpho-
logical composition of a word. But we will argue below that the basic
phonological unit is a word template, specifically defined on a phonological
unit that is also a fundamental symbolic unit.² We will argue that starting from
words can solve certain theoretical and empirical problems that arise for reasons
not directly connected to language function and, furthermore, that this reflects
the developmental learning sequence.
The second issue is the empirical range of a linguistic theory. A central
fact about linguistic data is the pervasiveness of variation: variation across
languages, across dialects, across speakers, across utterances by an individual
speaker, and also variation in the behavior of linguistic units across linguistic
contexts. We do not believe it is appropriate to abstract away from empirical
variation, or to attempt to explain it away (e.g., by positing separate invariable
grammars; see, e.g., Croft 2000: 513). Instead, we seek a model of grammatical
representation that will accommodate this variation. The need to accommodate
the full range of variation observed within and across languages will play a central
role in our arguments for a template-based approach to segmental phonology.
The third issue is the relationship between a linguistic theory and psychological
plausibility. In many linguistic theories, it is common to separate grammatical
competence from performance, and to evaluate competence theories on the basis
of principles of simplicity and generality, leaving aside performance or even the
precise psychological implementation of the competence module. But simplicity
and generality are a priori formal criteria, not psychological ones. Moreover,
separating competence from performance makes it impossible to subject compe-
tence models to empirical psycholinguistic evaluation.
We consider it to be preferable (other things being equal) to posit a unified
model of grammatical representation that does not separate a competence
module from its psychological implementation, or from actual language pro-
cessing (compare Bybee 2001: 8). In particular, psycholinguistic evidence
should be relevant to the evaluation of theories of grammatical representation.
In this chapter, we focus on the representation of the phonological structure of
linguistic units. We draw on another type of psycholinguistic evidence, namely,
that afforded by language development, to support a template-based approach
to phonological representation.
The developmental data that we bring to bear on the question of word
templates in phonology raise a final general issue: the relationship between
child language data and data derived from adult linguistic behavior. Only the
latter are normally used as a basis for theories of linguistic representation. Such
theories are then applied to first language acquisition data. Often there are
substantial discrepancies between the hypothesized adult system and the devel-
oping child system. In this situation, two opposing proposals are typically made.
The discontinuity hypothesis maintains that the process by which
language is learned and the representations developed by the child are different
from those that are found in the adult system and must therefore somehow be
replaced by the adult system at a later stage of development. The discontinuity
hypothesis is unattractive because it seems to make little or no connection
between what the child knows and does and what the adult knows. It also
appears to insulate the theory of the adult system from any potentially disconfirming
data from child language development.
The continuity hypothesis maintains that the child already knows the
adult system (because many aspects of it are innately specified). The inability of
the child to exhibit adult linguistic behavior is taken to be due to performance
and other limitations (or in one variant, to the need for innate capacities to
mature over time). The continuity hypothesis is also unattractive in that it too
appears to insulate the competence model of the adult from any potentially
disconfirming developmental data.
We suggest that it is preferable to develop a theory of linguistic representation
that draws on developmental as well as adult data from the outset. Such a theory
will view the development of knowledge of linguistic structure as a gradual
process, assuming neither full adult competence from the beginning nor a
discontinuity between developmental stages and adult outcome. The template-
based approach to segmental phonology constitutes such a theory. It proposes
that a limited number of specific, actual word shapes are the first steps in
phonological learning. The child gradually develops first one or a small number
of phonological templates, then a wider variety of them, while at the same time
inducing a range of other phonological categories and structures from the
known word shapes. The result of differentiating and generalizing knowledge
of the phonological structure of words in the course of language acquisition is
an adult template-based model of phonological representation, with neither
discontinuity nor an assumption of pre-specified adult competence.
2. Word templates in early phonological development
2.1. A brief history
For over thirty years child phonologists have been claiming that the earliest
phonological structure is whole-word based. Perhaps the simplest expression of
the idea is that of Francescato (1968: 148) (who makes reference to Reichling
1935): "Children never learn sounds: They only learn words, and the sounds are
learned through words." At the time that the idea was first seriously put forward,
infant speech perception had not yet begun to be investigated and there were
few, if any, acoustic studies of children's word production. Nevertheless, the
pioneering studies in child phonology made some fundamental observations,
while later, more detailed studies have provided further support for the basic
idea of whole-word phonological development.
In 1971 two diary studies, one American (Menn), one British (Waterson,
whose work is rooted in the Firthian tradition; see also Menn 1983; Waterson
1987), provided empirical data that seemed to point to the idea that the
whole word was at the core of a child's early phonology. Concluding a close
analysis of her son Daniel's first words, Menn (1971) suggested that the facts
that "simplifying is principally by assimilation embracing the whole monosyllable,
all simplifying is done within word boundaries, [and] . . . there is no
conditioning across word boundaries" indicate that "the word is an entity, stored
and accessed as a block" (Menn 1971: 247, emphasis ours). Daniel's
"assimilation embracing the whole monosyllable" generally involved velar
harmony (e.g., at 22 months, when systematic forms began to appear: [gk]
cracker, [gg] bug, [gk] truck).
It has since become clear, partly through Menn's own later work, that a
number of qualifications have to be made to this summary of the facts.
We now know that conditioning can also occur across word boundaries, for
example (see Donahue 1986; Stemberger 1988; Matthei 1989; Menn and
Matthei 1992). Furthermore, there is no reason to equate the word with the
monosyllable, outside of an English language context. Disyllables dominate the
early lexicon of children acquiring most of the other languages in which early
word phonology has been extensively investigated, through either diary or
observational studies (Estonian, Finnish, French, Greek, Hebrew, Hindi,
Italian, Japanese, Spanish, Swedish, Welsh). The Germanic languages generally
may constitute exceptions, as monosyllables appear to be the most common
early word form in Dutch (e.g., Elbers and Ton 1985) and German (Leopold
1939; Elsen 1996) as well as English; for Swedish our data show that mono- and
disyllabic early word forms are in close balance. Table 2.1 indicates proportions
of word targets of differing lengths in a cross-linguistic sample of early word
data, with 3–25 children represented in each language group. However, it
remains the case that the fundamental intuition that whole words are at
the core of early phonology was convincingly illustrated in Menn 1971 and
Waterson 1971 for the first time.

Table 2.1. Mean length in syllables for early word targets in seven languagesᵃ
(ordered by proportion of monosyllables)

Language (N children)   1-syl.   2-syls.   3+-syls.   Mean words per child
English (5)             .59      .35       .06        120
Swedish (5)             .44      .52       .04        106
Welsh (5)               .36      .54       .10        53
Estonian (3)            .33      .58       .09        48
French (5)              .28      .68       .04        114
Finnish (10)            .18      .79       .03        133
Italian (25)            .17      .58       .26        22
Mean                    .34      .58       .09

ᵃ For additional detail regarding the data summarized here, see Vihman (1996) (English
and French), Vihman, Boysson-Bardies, Durand, and Sundberg (1994a) (Swedish),
Vihman and DePaolis (2000), Vihman, Nakai, and DePaolis (2006) (Welsh), Kõrgvee
(2001), Salo (1993) and Vihman (1976) (Estonian), Kunnari (2000) (Finnish), and
D'Odorico, Carubbi, Salerni, and Calvo (2001) (Italian).

Table 2.2 illustrates the type of phenomenon with which Waterson 1971 was
concerned, drawing on data from her son P.
This child's forms are less closely related to their adult targets than were those
that Menn reported for Daniel. Perhaps for this reason Waterson draws more
radical conclusions in attempting to account for her findings:
It . . . seems reasonable to consider that a child perceives some sort of schema in words or
utterances through the recognition of a particular selection of phonetic features . . . which
go into the composition of the forms of the words or groups of words, and this
recognition of a schema results in his producing words of the same type of structure
for such adult forms. (Waterson 1971: 206)
Unfortunately Waterson's insistence on perception as the source of her son's
early word schemas was never convincingly supported by direct evidence
(see Waterson 1987 for some attempts to provide such evidence, however),
and the idea that the child's patterns derive from what is salient in the target
words, although plausible, remains only an idea, since the evidence so far
inheres primarily in the production data themselves, a problematic circularity.
Ferguson, Peizer, and Weeks (1973) were sufficiently impressed by their
data, drawn from a case study of Weeks' granddaughter (see also Weeks 1974),
to assert that "for the adult we may assume that the predominant [phonological]
unit is the phoneme . . . [whereas] for many children the earliest domain seems
to be the entire lexical unit . . ." (p. 57). Two years later, basing themselves
primarily on their analysis of longitudinal first word data from three children
(including those of the English–German bilingual child Hildegard, as documented
by her father, Leopold 1939), Ferguson and Farwell (1975) published
the classic statement of the whole word position, which they extended to adult
phonology as well:
The data and analysis of this study suggest a model of phonological development and
hence of phonology which is very different from those in vogue among linguists.
The model would de-emphasize the separation of phonetic and phonemic development
[i.e., contra Jakobson 1941/1968], but would maintain in some way the notion of
contrast . . . It would emphasize individual variation . . . but would incorporate the
notion of universal phonetic tendencies . . . It would emphasize the primacy of lexical
items . . . but provide for a complex array of phonological elements and relations . . .
(Ferguson and Farwell 1975: 437)
This position has been cited repeatedly but has only recently begun to receive
empirical investigation. Studies with adults over the last five years or so have
shown that phonotactic familiarity effects, based on relative frequency of
occurrence of segments and segmental sequences, facilitate (speed up) the
processing of nonwords, although competitive effects deriving from known
lexical items (similarity neighborhoods) tend to slow processing of real words
in dense neighborhoods (see Vitevitch, Luce, Charles-Luce, and Kemmerer
1997; Vitevitch and Luce 1998, 1999). Similarly, Beckman and Edwards
(2000a) found that familiarity with particular phonemic sequences resulted in
more accurate repetition of nonwords by three- to four-year-olds (see also
Edwards, Beckman, and Munson 2004).

Table 2.2. P's early word templates: nasal structure (age 1;6) VV
(adapted from Waterson 1971)

Child form     Adult target
[aa]           another
[ee], [ii]     finger
[a]            Randall
[ee]           window

The idea of whole-word phonology was further extended and more tightly
defined by Macken (1979), who summed up her analysis of the early phonology
of a Spanish-speaking child by noting that "[a number of] unusual substitutions
can be accounted for by the overgeneralization of . . . preferred word patterns . . .
Prosodic similarity between certain adult words provides a plausible explanation
for the similar treatment of some words" (p. 29). Macken alludes to word
templates here ("preferred word patterns") and appears to be agreeing with
Waterson in finding a probable source for the child's patterns in the prosodic
similarity of words in the adult language. Based on her detailed longitudinal
case study, she goes on to adumbrate her findings for the early word learning
period: "all words had a consistent word pattern form; . . . new patterns resulted
from the expansion of previously acquired word patterns; some words changed
patterns over time as new word patterns were learned" (Macken 1979: 34).
We will see that this description fits the data for any number of other children
for whom detailed phonetic lists of early words have been provided in the
intervening years. Macken (1996) indicates further that she sees word templates
as being identifiable through "the typical overgeneralization and conspiratorial
effects of the several rules that operate to produce [a particular] output, e.g.,
metathesis (plus harmony) . . . , consonant epenthesis . . . , unusual deletion of
the input medial stressed V . . ." (p. 169).
How solid, and how cross-linguistically valid, is the empirical basis for the
whole-word phonology idea in language development? The three arguments
that have been primarily used to support the concept are as follows:
1. Variability of segment production: A child may produce the same sounds
differently in different words, and some words may be more variable than
others. This suggests that the child has knowledge of particular words but
has not yet developed abstract categories of sounds for production (Ferguson
and Farwell 1975).
2. Relationship of child word to adult target: The relation of early child words
to their adult models is often found to be difficult to account for on a
segment-by-segment basis. Instead, the child seems to be targeting a whole
gestalt (Waterson 1971). The resulting patterns have been described as
whole-word processes, sometimes characterized as either harmony
(assimilation of noncontiguous vowels or consonants) or melody (patterning
in the sequencing of noncontiguous vowels or consonants) (Grunwell 1982;
Macken 1992, 1995; Vihman 1996).
3. Relationship between child words: The interrelation between the child's own
words may be more evident than the relation to the adult models (Macken
1979). This is due to the child's eventual reliance on one or more word
templates, specific phonological patterns which fit many of the words that
the child attempts (these words are said to be "selected"), but which are also
extended to words that are less close to the template (these words are then
"adapted" to fit the template [Vihman and Velleman 2000]).
An additional argument can be proposed, with reference to the apparent basis
for developmental patterning that is distinct from the phonology of the adult
language:
4. Source of child patterns: The dominant child patterns of the early word
production period are responses to challenges posed by adult target words,
primarily, the challenge of producing distinct consonants or distinct vowels,
or both, in different syllables or different word positions (i.e., initial
and final consonants in a monosyllable, as in Daniel Menn's forms, cited
above).
We will provide no specific developmental evidence here in relation to (1), the
variability in production of the same segment in different words, but such
evidence can be obtained from the more detailed of the various single-case or
small-group studies cited (see also Section 3.1 below). The evidence to be
provided in Section 2.2 (as well as in Table 2.2 above), based on data from
individual children, will serve to illustrate the remaining arguments, which are
complementary. Finally, we will indicate some of the differential effects of
ambient language rhythmic patterning on the shapes of early child templates in
Section 2.3, where we provide cross-linguistic data based on three to ten
children per language group.
The nature of the challenge that early word production poses to children has
yet to be satisfactorily established. Some have argued that the challenge is
primarily representational (memory difficulties: see Vihman 1978; Macken
1979, among others) or articulatory (production difficulties: Labov and Labov
1978; Studdert-Kennedy and Goodell 1995, among others); both speech
planning (Chiat 1979) and speech processing (Berg and Schade 2000) have
also been identified as plausible bases for children's problems. Although
infants are known to have remarkable capacities for perceptual processing
(specifically, for segmental discrimination) from the earliest months, so that
perceptual problems per se might seem an unlikely source of difficulty,³ it
has become increasingly clear that the deployment of these capacities in
relation to the discrimination of minimally distinct word forms requires
additional attentional resources, at the very least, and constitutes a novel task for
one-year-olds (Stager and Werker 1997; Werker, Fennell, Corcoran, and
Stager 2002). Thus some combination of attentional or representational
factors may be involved, although differences in motor control and practice must
also affect differences in production (McCune and Vihman 2001).
2.2. Evidence for word templates in early phonological development
In the earliest period of acquisition the idea of structure emerging from known
holistic phonological units can be demonstrated in its simplest, most direct
form. Menn (1971) observed that early phonological patterning is partly
determined by the shapes of the first handful of words attempted (p. 246).
Later studies have made it clear that, contrary to Jakobson's (1941/1968)
well-known discontinuity view, the source of the shapes of the first words is often
to be found in prelinguistic vocal practice, or babbling (Stoel-Gammon and
Cooper 1984; Vihman, Macken, Simmons, and Miller 1985; Vihman and Miller
1988; Elbers and Wijnen 1992; Vihman 1992; McCune and Vihman 2001), with
some effects of the ambient language on vocal production being identifiable
even before first word production (Boysson-Bardies, Hallé, Sagart, and Durand
1989; Boysson-Bardies and Vihman 1991; for comparable effects in the
semantic domain, see Bowerman and Choi 2001).
The earliest word forms are thus typically closely related to the individual
child's babbling patterns (Vihman et al. 1985) as well as being relatively
accurate (Ferguson and Farwell 1975), and they may show strong selection
constraints (Ferguson, Peizer, and Weeks 1973; Schwartz 1988). That is, it is
often apparent that only a small range of the many possible adult word patterns
is attempted, with certain phonetically accessible forms characterizing most of
the first words produced. Such forms include particular phonotactic shapes
or prosodies (CVCV, VCV, or in some cases CVC); forms with a limited
range of onset consonant types (stops, nasals, glottals, and glides); forms with
only a single consonant type; forms including only low or front vowels,
especially in the first syllable; and forms involving associated CV sequences,
such as labial + /a/ or schwa, alveolar + front vowel, velar + back vowel (Davis
and MacNeilage 1990, 1995, 2000, 2002).
Although direct experimental evidence remains limited (but see Vihman and
Nakai 2003; DePaolis 2006), there is reason to believe that the earliest word
forms are the product of implicit infant matching of own vocal patterns to input
patterning (Vihman 1993, 2002b). This would account for the findings of relative
accuracy and of phonologically constrained selection. A first lexicon of some five
to ten identifiable, spontaneously produced adult-based words would be the result
of that match. As a result, the earliest word forms of children acquiring different
languages are broadly similar (with limited phonotactic shapes and consonant
and vowel patterns, as indicated above), being rooted in the physiological
constraints that govern vocal production in the babbling and first word period (Locke
1983; Locke and Pearson 1992; Davis and MacNeilage 1990, 1995, 2000; Kent
and Bauer 1985; Kent 1992; see Vihman 1996, Appendix B, which presents the
first few words of 27 children acquiring seven different languages; as well as
Tables 2.6a, 2.7a, 2.8a, and 2.9a below, which also sample the first word forms of
children acquiring different ambient languages).
Within these biologically given limits, however, the ambient language shapes
the first phonological patterns or templates, which emerge out of the first words
as the child begins to target new word forms beyond his or her existing range,
sometimes selecting minimally new adult patterns to attempt, sometimes
adapting more distant adult patterns by imposing an existing pattern on them
(Vihman and Velleman 2000). Whereas the first words are individual by child
but broadly similar cross-linguistically, the templates that are then induced from
them, signaling the first phonological organization, reflect language-particular
differences to a limited extent, as we will illustrate below.
Individual synchronic patterns from children learning a wide range of
languages have provided evidence of word templates, with or without making
reference to whole-word phonology (for examples, see Berman 1977 [Hebrew/
English]; Macken 1978, 1979 [Spanish]; Vihman 1993 [French]; Vihman and
Velleman 1989, Vihman, Velleman, and McCune 1994b [English]; Vihman and
Velleman 2000 [Finnish]; in addition to the children whose data are presented
here). Tables 2.3–2.5 add to the sample in Table 2.2 with examples from
Vihman's son Raivo, acquiring both English and Estonian, Waterson's son P,
and another Estonian-learning child, Madli; note the similarity of the Estonian
data in Tables 2.3 and 2.5 to Waterson's data (Tables 2.2 and 2.4).
Table 2.3. Raivo's early word templates: nasal structure
(Estonian; age 1;3.18–1;3.24) nN (N = any nasal) (adapted
from Vihman 1981)
Child form                     Adult target   Gloss
[in(+)], [n(+)] (im.); [n]     lind           bird
[nnn], [nn]                    rind           breast (nursing)
[nni], [n], [n], [nn]          king           shoe
[ni], [ninin], [ni]            kinni          closed
+ indicates several repetitions of the syllable in production; im. = imitation
Table 2.4. P's early word templates: sibilant
structure (age 1;6) (stop)V (adapted from
Waterson 1971)
Child form      Adult target
[by]            brush
[di]            dish
[i]             fetch
[i], []         fish
[]              vest
No segmental substitution account could do justice to these data or capture
the systematicity apparent here. This was the point that Waterson was making in
1971; the little word groups or schemas that she identified when her son P had
roughly 150 words turn out to roughly characterize Madli's and Raivo's
Estonian early word patterns as well.
Three types of clues are generally used to identify a child's word template(s):
(a) Consistency of patterning in a substantial number of the child forms for
words produced in one or more recording sessions or over a period of some
weeks or months;
(b) The occurrence of unusual phonological correspondences between adult and
child forms (i.e., rules or processes or repairs to target word violations of
child constraints), under the influence of a dominating pattern or template;
(c) Frequently, a sharp increase in words attempted that either fit or can be fitted
into the pattern.
Given these criteria, it is clear that such patterns are most reliably identified on the
basis of longitudinal data from the same child, as Macken (1996) emphasized.
The systematicity in a child's early word production tends to be evident only after
the child has produced some critical number of word forms. The number of forms
will vary from one child to the next, since the emergence of a systematic word
production plan or template depends on the child inducing this structure from the
words s/he is able to say. For example, Menn (1971) observed:
using hindsight, only 3 of [Daniel's first] 30 words fail to satisfy the constraints reflected
by the first set of phonotactic rules, those which govern stage 2 . . . One is led to the
opinion that, while phonotactic rules have not yet crystallized in stage 1, something
vaguely systematic, from which the rules will develop, is at work. (Menn 1971: 231f.)
A developmental progression can thus characteristically be tracked in
longitudinal studies of individual infants, from relatively accurate (but highly
constrained) earliest word forms to systematically adapted (and thus sometimes
less accurate but wider ranging) later forms. To illustrate this progression
Table 2.6 presents data from a case study of a child acquiring German in a
monolingual context (Elsen 1996). Here and in what follows we will distinguish
the first words, which we term "selected" (these are the early words in which
"something vaguely systematic . . . is at work"), and the later words, which may
be either "adapted" (e.g., the velar harmony words produced by Daniel as his
phonotactic rules began to operate) or "selected," in cases in which the adult
word targeted already fits the child's existing phonotactic constraints or word
template.
Table 2.5. Madli's early word templates (Estonian;
age 1;8) (p, t)Vs (adapted from Kõrgvee 2001)
Child form    Adult target    Gloss
[is]          isa, issi       daddy
[as]          kass            cat
[pis]         piss            pee
[us]          suss            slipper
[tis]         tiss            teat
[us]          uss             snake
Table 2.6. Developmental progression in first words (Annalena: German).
CV(C1V1); Vi; labial–alveolar as phonological patterns, first fifty words
(data from Elsen 1996)
a. Select only (8–10 mos.)
Child form    Adult target                     Characteristic pattern (based on later template)
[da]          da there                         CV
[ba]          Buch book                        CV
[ai]          ei! (fondling expression)        Vi
[ai]          Ei egg                           Vi
[nain]        nein no                          Vi
[mama]        Mama mama                        CVCV: CH + VH
[baba]        Papa papa                        CVCV: CH + VH
[pipi]        pieppiep mouse                   CVCV: CH + VH
[dd]          Teddy                            CVCV: CH + VH
[data]        das da that one there            CVCV: CH + VH
[bita]        bitte please                     lab C . . . alv C
b. Select + adapt (10–12 months)
Select
Child form    Adult target                     Template
[ja]          ja yes                           CV
[bi]          Bild picture                     CV
[de]          Tee tea                          CV
[d]           Zeh toe                          CV
[hai]         heiss hot                        Vi
[ba]          Baum tree                        V
[ail]         Eule owl                         Vi, Vl
[pp]          tt toot (blow nose)              CVC: CH
[mom]         bong!                            CVC: CH
[kiki]        kikeriki cock-a-doodle-do        CVCV: CH + VH
[pipi]        Pipi peepee                      CVCV: CH + VH
[nan]         Banane banana                    CVCV: CH + VH
[bebi]        Baby                             CVCV: CH
Adapt
Child form    Adult target                     Template
[ba]          Wasser water                     CV
[bai]         Wasser water                     CV + Vi
[oi]          oh!                              Vi
[ail]         Öl oil                           Vi + Vi
[mom]         Baum tree                        CVC: CH [note regression]
[nana]        Zahn(bürste) tooth(brush)        CVCV: CH MET + RED
[nana]        Annalena                         CVCV: CH TRUNC + MET
[dada]        Tag (good)day                    CVCV: RED
[vava]        wauwau bowwow                    CVCV: CH
CH = consonant harmony; VH = vowel harmony; MET = metathesis, RED = reduplication;
TRUNC = truncation
We have organized the words according to their patterning, primarily their
phonotactic patterns. In the first months of word production we find simple
monosyllabic Ca patterns (with initial stop: da, Buch), VV and CVVC
(with the rising diphthong [ai]: ei!, Ei, nein), CVCV (with both consonants
and vowels agreeing across the two syllables: Mama, Papa, pieppiep, Teddy,
das da), and a single C1V1C2V2 pattern, with a labial–alveolar sequence
(bitte). The child's forms are closely related to their adult targets; in Ferguson
and Farwell's terms, they are fairly accurate, although we find some omission
of syllable-final consonants and two instances of vowel change ([ba] for Buch,
[dd] for Teddy).⁴
In the following two months, as the pace of word learning quickens
considerably (some forty new words are added), we find (under select) all of the
same patterns represented, with some loosening of the constraints apparent in
the earlier words. The CV patterns include new vowels and an initial glide; the
diphthong [a] occurs as well as [ai]; new syllables occur in harmonizing
disyllabic words. In addition, there are two new phonotactic shapes for
words: VCV and CVC. It is notable that the CVC syllables, the only
word forms with differing C1 vs. C2, either show consonant harmony or retain
the previously represented sequence labial–alveolar. Under adapt, moreover,
we find essentially the same word shapes and sequential constraints but with
more radical departures from the adult model.
Table 2.6. (cont.)
Select
Child form      Adult target           Template
[babiᵈ]         Papier paper           CVCV: CH
[i]             trinken to drink       CVCV: CH
[ata], [ada]    ada bye                VCV
[man]           Mann! oh boy!          CVC: lab . . . alv
[man]           Mann man               CVC: lab . . . alv
[bal]           Ball ball              CVC: lab . . . alv
Adapt
Child form      Adult target           Template
[baba]          Bauch belly            CVCV: RED
[aa]            essen to eat           VCV
[bal]           Lampe lamp             CVC: lab . . . alv MET
[bl]            Brille glasses         CVC: lab . . . alv
CH = consonant harmony; VH = vowel harmony; MET = metathesis, RED = reduplication;
TRUNC = truncation
One way of conceptualizing the child's adapted forms is to see them as
the result of the child (implicitly) imposing one or more preexisting
templates, or familiar phonological patterns, on an adult form that is sufficiently
similar to those patterns to serve as a hook. From this perspective, we can
see the effects of the child's practice or motoric familiarity with
reduplicated patterns (resulting in [nana] for Zahnbürste and [baba] for Bauch, for
example) and with the diphthong [ai], which now appears unexpectedly in
adult words that lack it (e.g., Wasser, oh!, Öl). Note that the child has
consistently produced only C1C2 sequences involving labials followed
by alveolars (see bitte among her first words, Mann, Ball, Brille among her
later words), this also being the presumed motoric-plan basis for the
metathesis of Lampe to [bal]. Thus, from a usage-based perspective, the child's
adoption of the pattern [bal] (identical to her production of Ball) for Lampe
is not surprising, despite the fact that it involved both (1) omission of the
final vowel and medial nasal and (2) rearrangement of the syllable-onset
consonants.
In these data, then, we can see evidence of a shift from the exclusive production
of words that deviate very little from the adult model to words that may deviate
quite markedly, and in different ways for different words, with the result that
certain patterns are heavily overrepresented in the child's surface forms. In
general, the child's changes affect whole-word forms, not individual segments,
and a number of word templates or well-practiced patterns can be identified,
some of them acting jointly in certain cases (CVC + labial–alveolar, for
example).
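The select/adapt distinction drawn above can be made concrete with a small sketch. It is illustrative only: the forms are rough ASCII stand-ins for the phonetic transcriptions, the segment classes are simplified, and the template is reduced to a single hypothetical pattern (a CVC shape whose consonants run labial, then alveolar), all of which are assumptions rather than part of the analysis reported here.

```python
"""Sketch: label a child word 'select' or 'adapt' relative to one template.

Assumptions (not from the source): forms are rough ASCII stand-ins for the
phonetic transcriptions, and the template is reduced to a CVC shape whose two
consonants run labial -> alveolar.
"""

VOWELS = set("aeiou")
LABIALS = set("pbmfvw")
ALVEOLARS = set("tdnszl")


def cv_skeleton(form: str) -> str:
    """Map each segment to C or V, e.g. 'bal' -> 'CVC'."""
    return "".join("V" if seg in VOWELS else "C" for seg in form)


def fits_template(form: str) -> bool:
    """True if the form is CVC with a labial onset and an alveolar coda."""
    return (
        cv_skeleton(form) == "CVC"
        and form[0] in LABIALS
        and form[-1] in ALVEOLARS
    )


def classify(adult_target: str, child_form: str) -> str:
    """Return 'select', 'adapt', or 'outside template' for one word."""
    if not fits_template(child_form):
        return "outside template"
    # The child form fits; what matters is whether the target already did.
    return "select" if fits_template(adult_target) else "adapt"


if __name__ == "__main__":
    # (adult target, child form), both in simplified ASCII
    words = [("bal", "bal"), ("man", "man"), ("lampe", "bal")]
    for target, child in words:
        print(f"{target} -> [{child}]: {classify(target, child)}")
```

On this reading Ball and Mann count as selected, since the target already fits the pattern, whereas Lampe produced as [bal] counts as adapted. A fuller treatment would need several templates and a proper phonetic representation.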
In Table 2.7 we see the first words of a child (Virve) acquiring Estonian but
with some exposure to English as well (Vihman 1976).
This child began talking early, although not as precociously as Annalena.
Her early word production suggests tightly constrained phonological selection,
in that words attempted as well as word forms produced were restricted to
(1) a limited segmental inventory (labial and alveolar stops, [s], glides and
glottals), (2) constrained word shapes such that only a single consonant type
could occur anywhere in the word ([tete] for tere), and (3) constrained vowel
sequencing as well (lower vowel first, higher vowel second). Note that three of
Virve's first six recorded words include the diphthong [ai], the same diphthong
favored by Annalena.
In the following two months of rapid lexical advance Virve loosened
constraints on possible word forms step by step, as illustrated in Table 2.7b. First
manner ([tin] for kinni),⁵ then place (Manni) were allowed to vary, but not both.
Within the vowel sequences, similarly, we see a consistent tendency to produce
either harmonizing forms or V(. . .)i/u patterns, these word forms being
supported by the adult models listed under select but imposed on the models
listed under adapt.
Although the final /i/ pattern is also commonly found in English (e.g., Molly,
in Vihman and Velleman 1989; Alice, in Vihman et al. 1994a; and the subject
of Davis and MacNeilage 1990) and can plausibly be related to the high input
frequency of diminutives such as baby, doggie, kitty, nappy, etc., it is not
necessary to invoke English influence as a source of Virve's patterns.
Table 2.7. Developmental progression in first words (Virve: Estonian [and English]) (Vihman 1976).
a . . . i or V1 . . . V2 = low . . . non-low
a. Select only (10–12 months)
Child form             Adult target                      Characteristic pattern (as identified in later template)
[hai]                  hi                                CVV: Vi
[pai]                  pai nice                          CVV: Vi
[aita], [aida]         aitäh /aith/ thanks               VV(CV): Vi
[ao]                   allo hello (into telephone)       VV: Vo
[se]                   see this                          CV
[te], [tee], [tete]    tere hello                        CV(CV)
Select
[titi]                 kikerikii cock-a-doodle-do        CVCV: CH, VH
[ap]                   habe beard                        VCV
[k k ]                 cookie, cracker                   CVCV: CH, VH
[tin]                  kinni closed                      C1VC2
[tata], [tai]          tädi /tti/ auntie                 CV(CV): CH, Vi
[pebi]                 beebi baby                        CVCV: CH, V1 . . . V2 (i)
[api]                  appidu uppy-do (jump)             VCV: V1 . . . V2 (i)
[pai]                  bye                               CV: Vi
[ta | si]              tantsi dance                      CVCV: V1 . . . V2 (i)
[atsi(h)]              tsih achoo                        VCV: V1 . . . V2 (i)
[mani]                 Manni (name)                      CVCV: V1 . . . V2 (i)
[pawawei]              papagoi parrot                    CVCVCV: Vi
Adapt
[asi]                  isa father                        VCV: V1 . . . V2 (i) MET
[ami] [ani]            ema mother                        VCV: V1 . . . V2 (i) MET
[ati]                  liha meat                         VCV: V1 . . . V2 (i) MET
[ta | ti]              lahti open                        CVCV: V1 . . . V2 (i) CH
[tati]                 kallikalli hug                    CVCV: V1 . . . V2 (i) CH
[papu]                 bravo                             CVCV: V1 . . . V2 (high V) CH
Adult Estonian words have initial stress unless otherwise noted. CH = consonant harmony; MET = metathesis; VH = vowel harmony
Table 2.8 presents all the disyllabic words attempted among the first fifty
words of a monolingual Estonian-learning child, Eeriku (Salo 1993).
Like Virve, Eeriku generally avoided the vowel sequence non-low . . . low
(that is, he observed a sequential constraint on vowel height, which we term
SEQ) as well as nonharmonizing front–back vowel sequences (F/B), adapting
words which fail to meet those constraints by the use of truncation (TRUNC)
and metathesis (MET) as well as vowel harmony (VH). As can be seen in
Table 2.8a, the first few longer words that Eeriku attempted had low vowels
only or were truncated to eliminate the second vowel. Word (12), isa 'daddy', is
the only word that violates SEQ until the very last few words produced in this
period, which covered a full year in Eeriku's case. Eeriku showed a highly
unusual affinity for the difficult Estonian consonant (trilled) /r/. Of his first 50
words 13 include an /r/; in several cases he appears to truncate specifically in
order to produce a syllabic or coda /r/. Otherwise, the adaptations of adult
targets included in Table 2.8b all seem to conspire to achieve a vowel sequence
that violates neither SEQ nor F/B (for each word we have indicated the violation
avoided in italics).
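The vowel-height constraint SEQ lends itself to a compact illustration. The sketch below is a deliberately crude reading, not the authors' procedure: it assumes orthographic ASCII forms, a three-way height scale (a = low; e, o = mid; i, u = high), and it interprets SEQ as "vowel height never falls from one vowel to the next." The helper `swap_outer_vowels` is a hypothetical stand-in for metathesis.

```python
"""Sketch: test a word against a vowel-height sequencing constraint (SEQ).

Assumptions (not from the source): orthographic ASCII forms stand in for the
transcriptions, vowels have three heights (a = low; e, o = mid; i, u = high),
and SEQ is read as 'vowel height never falls from one vowel to the next'.
"""

HEIGHT = {"a": 0, "e": 1, "o": 1, "i": 2, "u": 2}


def vowel_heights(form: str) -> list[int]:
    """Return the height of each vowel in the form, in order."""
    return [HEIGHT[seg] for seg in form if seg in HEIGHT]


def violates_seq(form: str) -> bool:
    """True if some vowel is lower than the vowel preceding it."""
    heights = vowel_heights(form)
    return any(later < earlier for earlier, later in zip(heights, heights[1:]))


def swap_outer_vowels(form: str) -> str:
    """Metathesis stand-in: exchange the first and last vowel of the form."""
    positions = [i for i, seg in enumerate(form) if seg in HEIGHT]
    if len(positions) < 2:
        return form
    segs = list(form)
    first, last = positions[0], positions[-1]
    segs[first], segs[last] = segs[last], segs[first]
    return "".join(segs)


if __name__ == "__main__":
    for target in ["isa", "ema", "istu", "sisse"]:
        print(target, "violates SEQ" if violates_seq(target) else "fits SEQ")
    print(swap_outer_vowels("isa"), swap_outer_vowels("ema"))
```

Under these assumptions isa and ema violate the constraint, and exchanging their vowels yields asi and ame, which match Virve's [asi] (Table 2.7) and Eeriku's [ame] (Table 2.8). The front/back constraint (F/B) is not modeled here.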
Finally, in Table 2.9 we see the same developmental progression that was
illustrated in Tables 2.6–2.8, this time based on data from a child acquiring
English, though with some exposure to Spanish (Alice: Jaeger 1997), and
starting on her first word production at 18 months, several months later than
the two children discussed in some detail so far.
Alice again shows only minor changes from the adult model in most of her
first words (select only). The child forms for food, bottle, and doggie
constitute an exception: Jaeger notes that these unusual phonetic forms, which were
produced with a strongly nasal release of the medial obstruent, correspond to
one of this child's frequent prelinguistic babbling patterns.
However, five months later, when Alice had acquired a lexicon of some
100 words, she had developed a striking word-form constraint or template,
restricting unlike consonants to a front-before-back sequence. This led to
extensive changes to some adult words (adapted), while other words showed
only minor consonant or vowel substitutions (selected). The constraint was
prefigured by 6 (out of a total of 22) earlier words (bottle, mine, doggie, this,
and, at 20–21 months, block, stocking). At 23 months the only exceptions to the
constraint were the words dummy, jump, and tum, one of only two exceptions
to the constraint among Alice's first words. It seems likely that the exceptional
status of all three words at the later stage stems from entrenchment due to the
frequent use Alice made of this form in a period of great lexical expansion.
While living temporarily with her grandparents, from 1;9.15 on, she called both
of them [tm] for a few days.
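Alice's front-before-back constraint can be sketched in the same spirit. The place ranking used below (labial < alveolar < palatal < velar) and the representation of each word as a bare list of place labels are assumptions made for illustration; `metathesize` is a hypothetical helper that simply reorders the places front to back.

```python
"""Sketch: check a front-before-back ordering constraint on consonant place.

Assumptions (not from the source): place is ranked labial < alveolar <
palatal < velar, and each word is given as a list of place labels rather than
as a transcription.
"""

PLACE_RANK = {"lab": 0, "alv": 1, "pal": 2, "vel": 3}


def obeys_fronting(places: list[str]) -> bool:
    """True if no consonant is articulated further front than one before it."""
    ranks = [PLACE_RANK[p] for p in places]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


def metathesize(places: list[str]) -> list[str]:
    """Reorder the consonant places front-to-back (a stand-in for metathesis)."""
    return sorted(places, key=PLACE_RANK.__getitem__)


if __name__ == "__main__":
    targets = {"David": ["alv", "lab"], "kite": ["vel", "alv"], "puppy": ["lab", "lab"]}
    for word, places in targets.items():
        if obeys_fronting(places):
            print(f"{word}: {places} already fits (select)")
        else:
            print(f"{word}: {places} -> {metathesize(places)} (adapt)")
```

Like consonants pass the check, in line with the statement that the constraint applies to unlike consonants. An alveolar followed by a labial fails it, which is the place sequence of the exceptional dummy and tum.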
Table 2.8. Developmental progression in first words (Eeriku: Estonian) (Salo 1993). From vowel harmony (VH) constraint to
sequential constraint V1 . . . V2 = low . . . non-low (SEQ) or front/back harmony (F/B)
First fifty words: 1;5–2;5. All (non-onomatopoeic) multisyllabic target words are listed below, along with the child's word form. Numbers
in parentheses refer to the order of first production of these forms.
a. No vowel sequences allowed
Child form    Adult target                        Template
[ppa]         päkapikk elf (3)                    CVCV: RED
[paba]        paber paper (5)                     CVCV: RED
[ana]         vanaema grandmother (9)             VCV: VH
Child form    Adult target                        Adaptation
[tit]         tita child (4)                      TRUNC
[en:]         onu uncle (6)                       TRUNC
[:]           väike little (8)                    TRUNC
b. Vowel sequences admitted (but low . . . non-low preferred)
Select (target vowels fit pattern)
Child form    Adult target                        Relation of target to template
[isa]         isa daddy (12)                      Violates SEQ and F/B
[a:u]         halloo! (24)
[pa:p:a]      papagoi parrot (30)                 VH
[aith]        aitäh thanks (33)                   Violates F/B
[istu]        istu sit! (37)                      Violates F/B
[arstd]       arsti(-)tädi doctor-auntie (38)     Violates F/B
[priv]        prillid glasses (40)                TRUNC (despite VH in target)
[bi]          käbi pinecone (41)
[sin:a]       sinna to there (45)                 Violates SEQ and F/B
[sis:e]       sisse to inside (46)                Violates SEQ
[pe]          päike sun (47)
Adapt (target vowels violate pattern)
Child form    Adult target                        Adaptation                   Relation of target to template
[tr:u]        toru, torud pipe, pipes (14, 15)                                 [produce r]
[mum:]        muna egg (16)                       TRUNC                        Violates SEQ
[ame]         ema mother (17)                     MET                          Violates SEQ and F/B
[pop:]        potsataja fairy tale animal (18)    TRUNC                        Violates SEQ
[amo]         homme tomorrow (19)                 MET                          Violates SEQ and F/B
[aut]         auto car (20)                       TRUNC                        Violates SEQ
[trar]        traktor tractor (21)                TRUNC                        [produce r]
[o:ro]        koori peel (23)                     VH                           Violates F/B
[trr]         terita- sharpen (pencils)           TRUNC                        [produce r]
[o:t]         oota wait (32)                      TRUNC                        Violates SEQ
[or:]         orav squirrel (36)                  TRUNC                        Violates SEQ [produce r]
                                                  MET (first two syllables)    Violates SEQ
[ara]         hari brush (42)                     VH                           Violates F/B
[pe]          pea head (43)                       TRUNC                        Violates SEQ
[avr]         Aivar (44)                          TRUNC                        [produce r]
[todo]        Tota-tädi Auntie Tota (49)          VH                           Violates SEQ
MET = metathesis, RED = reduplication; TRUNC = truncation
Table 2.9. Developmental progression in first words (Alice: English) (data
from Jaeger 1997). C1 C1 or fronting constraint: labial–alveopalatal,
labial–velar, alveopalatal–velar
a. Select only (18–19 months)
Child form    Adult target                   Child form     Adult target
[mama]        mommy                          [hai], [ai]    hi
[tata]        daddy                          [aw]           out
[nana]        Anna                           [(p)pai]       byebye
[peipi]       baby                           [tm]           music: tum(te-tum)?
[kta]         look at that                   [main]         mine
[kak]         food: cracker/cookie?          [ti]           this
[papm]        bottle                         [mm]           no: mm-mm
[tak]         doggie                         [o]            uh-oh
Child form    Adult target    Places         Child form     Adult target    Places
[ptu]         butter          lab alv        [pita] MET     David           alv lab → lab alv
[tikʰ]        cheek           alv vel        [taik] MET     kite            vel alv → alv vel
[pakʰ]        frog            lab vel        [pi] MET       sheep           pal lab → lab pal
[ppi]         puppy           lab lab        [pu] MET       soup            alv lab → lab alv
[ti]          teeth           alv pal        [piti] MET     TV              alv lab → lab alv
Exceptions (based on entrenchment of [tm]?)
[tm]          dummy
[tmp]         jump
[tmi]         tum music
MET = metathesis
2.3. Prosodic/segmental interactions and ambient language influence
So far we have looked at longitudinal data from three children, each acquiring a
different language, as well as at sample word patterns from a few additional
children acquiring English and Estonian. We have seen that some patterns occur
cross-linguistically and that the early segmental types children produce tend to
be similar regardless of the language to which the child is exposed. Some
patterns do differ by ambient language, however. In this section we illustrate
the effect of the ambient language on early child word patterns by considering
"no onset," or child omission of word-initial consonants. This pattern is
disfavored by markedness constraints: CV is the most widely occurring syllable
pattern, universally, and is also the first adultlike syllable infants produce (at
about 6–8 months [Oller 1980, 2000]). However, as we shall see, the accentual
pattern of the adult language renders some segmental positions more salient
than others, so that although the omission of initial consonants occurs only
rarely in English child words, it is far more common in other languages. We will
summarize some evidence to this effect and will then consider how differences
in adult language accentual patterning might result in this difference in early
child word patterns.
In a study of Finnish children acquiring geminate consonants Vihman and
Velleman (2000) were surprised to find that the second most common child
phonological pattern (after consonant harmony) was "no onset" (31 percent,
both selected and adapted), a pattern considered to be a mark of deviant
phonology in English (see also Savinainen-Makkonen 2000). Subsequent
analyses of data from children learning other languages suggest that it is the
absence of any such pattern in data from English-speaking children that is
unusual. Table 2.10 shows the proportion of initial consonant omission in
selected and adapted word forms for each of five languages.
The column labeled % select shows the mean proportion of the children's
forms that are based on adult words (or phrases) that fall into the "no onset"
pattern. Although Finnish has the highest proportion, the languages are roughly
evenly distributed across the range, from 12 to 24 percent. The column labeled
% adapt shows the incidence of child forms in which an initial consonant of
the adult form has been omitted (a pattern seen in some earlier tables as well).⁶
Here we see that four of the five languages cluster closely together, with
incidence of initial target consonant omission ranging from 14 percent to 16
percent. Only English, in accordance with what has generally been taken to be
the universal norm, shows a very low incidence of initial consonant omission
(4 percent); see Figure 2.1.
Table 2.10. Initial consonant omission in five languagesᵃ
Language (N children)    % select    Language (N children)    % adapt
Finnish (11)             23.9        French                   16.4
Estonian (3)             22          Welsh                    16
French (5)               15.4        Finnish                  14.9
Welsh (5)                13          Estonian                 14
English (6)              11.8        English                  4.3
Mean                     17.04                                13.12
ᵃ Data from the case study of Sini, a child acquiring Finnish (Savinainen-Makkonen
2001), and from Andrew, a child acquiring British English (French 1989), have been
added to the data cited in the footnote to Table 2.1.
Thus, a similar proportion of target words and phrases lack an onset
consonant in all five languages (based on words selected), but the children are less
likely to adapt target words by omitting an onset consonant in English than in
any of the other languages. We must look beyond the basic segmental structure
of the language to account for this.
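The size of the English contrast can be checked with simple arithmetic on the % adapt column of Table 2.10. The sketch below copies the five printed values and treats the mean as a plain unweighted average over languages, which is an assumption; the function name `summarize` is illustrative.

```python
"""Sketch: summarize the '% adapt' column of Table 2.10.

The figures are copied from the table; treating the mean as a simple
unweighted average over languages is an assumption.
"""

from statistics import mean

PCT_ADAPT = {
    "French": 16.4,
    "Welsh": 16.0,
    "Finnish": 14.9,
    "Estonian": 14.0,
    "English": 4.3,
}


def summarize(pct: dict[str, float], outlier: str) -> dict[str, float]:
    """Compare the outlier language with the mean of the remaining ones."""
    others = [value for lang, value in pct.items() if lang != outlier]
    return {
        "mean_all": round(mean(list(pct.values())), 2),
        "mean_without_outlier": round(mean(others), 1),
        "ratio": round(mean(others) / pct[outlier], 1),
    }


if __name__ == "__main__":
    print(summarize(PCT_ADAPT, "English"))
```

On these figures the unweighted mean over all five languages matches the printed 13.12, while the four languages other than English average about 15.3 percent, between three and four times the English rate.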
The languages differ in their accentual patterns, especially their rhythmic
patterns. In English the dominant trochaic pattern is manifested, phonetically, in
a longer and louder first syllable (which may also be higher in pitch) and a
reduced second syllable (Vihman, DePaolis, and Davis 1998; Vihman, Nakai,
and DePaolis 2006). In none of the other languages do these factors jointly
affect the first syllable, despite the fact that in our sample all but one of the
languages are primarily or exclusively trochaic. In French the dominant pattern is
iambic, with lengthening of the final syllable as the primary accentual marker. In
Welsh, although the first syllable of a disyllable is normally stressed, this is
manifested by a short first-syllable vowel followed by a lengthened medial
consonant and a long second vowel (see Vihman et al. 2006 for documentation
of both adult and child production). Finnish, although strictly and exclusively
trochaic, has another highly salient rhythmic characteristic, frequently
occurring medial geminates, which can deflect infant attention away from the initial
consonant. Indeed, the presence of medial geminates appears to be a powerful
attractor for infant attention, since children target a disproportionate number
of words containing them (49 percent, compared to an incidence in mothers'
content words of 37 percent) (Vihman and Velleman 2000). In the children's own
productions, 55 percent have long medial consonants, again suggesting attention
to and overextension of this rhythmic property.
Figure 2.1. No onset (selected vs. adapted) in five languages
[Two panels, "No onset (selected)" and "No onset (adapted)"; y-axis: % of all word types, 0 to 40; x-axis: Finnish (11), French (5), Welsh (5), English (6), Estonian (3)]
Here then we see group results analyzed in the same way as the longitudinal
data presented in Tables 2.6–2.9 above. A similar proportion of VCV patterns
occurs in the input in all five languages (mean of 17 percent), based on child
selection of words to attempt that lack an initial consonant (e.g., English
uh-oh, Table 2.9). In the case of all of the languages except English the
children extend the pattern to assimilate word targets falling outside it in the
adult language. In some cases the omitted consonant itself poses a problem for
the child (see Table 2.4, in which P, learning English, systematically omits
initial fricatives). In most cases, however, omission of the initial consonant
appears to be a way to arrive at a pronounceable form despite the difficulty
posed by a word-internal noncontiguous consonant sequence. This is a striking
demonstration of the effect of the whole-word (disyllabic) pattern on learning,
since it is the lengthening of a medial consonant or final vowel, or both, which
appears to draw the child's attention away from the initial segment, typically
considered most critical to word learning in English.
As further evidence for the hypothesized role of geminates in supporting a
"no onset" template, Table 2.11 summarizes the phonological patterning in the
complete lexicon of a child, V, aged 1;7, who is bilingual in Hindi and English
(with a few words from other Indian languages).
Table 2.11. Consonant harmony and "no onset" in a bilingual child, V (1;7) (based on Bhaya Nair 1991)
("Hindi" columns include a few Bengali and Malayalam words.)
Phonological pattern | English: Select | English: Adapt | Hindi: Select | Hindi: Adapt | Total word types
CV(V) | 7 no | 1 ball [b:] | 4 /ta/ tea | 1 /phu:l/ flower [pu:] | 13
V(V)(C) | 1 eye | 0 | 4 /a:g/ fire | 0 | 5
C1VC1 (or place agreement only) | 3 cake | 12 dog [kg] | 0 | 1 /na:k/ nose [ka:k] | 16
C1VC2 | 10 bus | 0 | 2 /ka:n/ ears | 1 /gram/ hot [gm] | 13
C1VC1V | 2 dirty | 0 | 6 /ba:ba/ grandpa | 0 | 8
C1VC2V | 1 bowwow | 0 | 3 /kʰta/ thorn | 0 | 4
VCV | 0 | 3 cover | 5 /a:pa/ aunt | 7 /pa:ni/ water [a:ni] | 15
VCCV | | | 6 /nda/ egg | 13 /khi/ comb [hi] | 19
C1VC1C1V | | | 1 /i/ excrement | 0 | 1
C1VC1C1VC1 | | | 2 /ti:tti:t/ sweet | 0 | 2
C1VC2C1VC2 | 1 ticktick | | 2 /ptpt/ beating | 0 | 3
Total | 25 | 16 | 35 | 23 | 99
One example of each occurring pattern is provided; numbers in each cell
indicate the total child word form types conforming to the pattern (T = 198
words).
This child primarily produces monosyllables in English (83 percent, far
exceeding the mean seen in other children acquiring English as well; see
Table 2.1) but disyllables in the Indic language words he knows (78 percent).
Indeed, the author/diarist sees the child's differential attention to English
monosyllables vs. Hindi disyllables as V's way of keeping the languages
apart in a setting in which several languages are current and code mixing
is the rule. V's English words also tend to show consonant harmony (15/41,
or 37 percent) while his Hindi words tend to show "no onset" instead (35/58, or
60 percent). Interestingly, three of his English words also show initial
consonant omission: [b] "cover", [ki] "monkey", [t] "water", a probable sign of
interaction with the Hindi pattern, since such a pattern seems highly unusual
for English words whose initial consonants are a stop, a nasal, and a glide.
Of the initial consonants omitted in non-English words, 6/20 are affricates or
/ / or /r/, segments the child does not yet produce or produces only rarely.
(Four English, three Hindi and one Bengali word are produced with initial
affricates; none have initial / / or /r/.) Yet segmental difficulties are not
the sole or primary basis for "no onset" since in three cases the omitted
consonant is a stop or nasal that agrees in full or in place only with the
medial consonant. Of the 20 child words that differ from their targets by virtue
of initial consonant omission, 13 (65 percent) have a medial consonant cluster;
8 of these (40 percent of the 20) are geminates. Thus, the medial long
consonants are as plausible a rhythmic source of the "no onset: adapt" pattern
here as in Finnish.
2.4. Universals of early phonological development or inductive
generalizations from the lexicon?
We have considered the emergence of word templates in the course of first word
production as recorded in several diary studies. The templates cannot be innate,
since they are not always present from the first words, nor can they be
universal, since they differ from one child to the next and also differ to some
extent by ambient language.
Rather, we take them to be the emergent product of three sources of
phonological knowledge for the child: (1) familiarity with the segmental
patterns typical of the adult language, which advances steadily over the last
few months of the first year (see Jusczyk 1992, 1997); (2) developing motoric
control and familiarity with a subset of adultlike phonological patterns due to
production practice (babbling); and (3) increasing familiarity with the
structure implicit in the children's own first lexicon. The child's early word
forms can be taken to reflect sensitivity to matches between his or her emergent
production patterns and frequently used adult words. The wide interchild
variability in early phonological patterning that we see even within the limits
of a single ambient language does not derive from the adult input, however, but
from the individual filter that each child brings to the word learning process.
This is evident from the fact that while the phonological patterns found by
sampling input from five mothers are strikingly similar, those of their five
children are widely different (see Vihman et al. 1994a, which replicates the
finding in three languages, English, French, and Swedish).
We take the fact that cross-linguistic differences shape word templates to be a
natural consequence of the induction process, since the target lexicon
necessarily shapes the patterns implicit in the child's first 50 words or so. We
note that English, Estonian, and German data often show a concentration of CVC
shapes (see also Vihman and Velleman 1989). In contrast, French data do not
normally show CVC forms as early as the first 50–100 words (Vihman 1993, 1996),
although the English–French bilingual early words reported by Brulard and
Carr (2003) do include such forms, and they dominated the English lexicon of
the child V, as indicated in Table 2.11. These diary studies provide some
insight into the construction of templates under conditions of bilingual input
(Vihman 2002a).
In short, we see the earliest phonological organization as constituting an
inductive generalization based on the child's first repertoire of phonetic
patterns and their interaction with the phonological structure implicit in the
words of the ambient language that the child is attempting to reproduce. The
phonological organization itself inheres in whole-word patterns or word
templates, as can be seen from the adapted patterns illustrated above.
Phonological categories will gradually emerge later, in different ways for
different children. The developmental pattern is like that found in recent
studies of early syntax, in which "verb islands" are found in lieu of abstract
grammar, with productive use of subcategories emerging only slowly, in different
ways for different children (e.g., Tomasello 1992; Lieven, Theakston, Pine, and
Rowland 2000).
3. From child to adult: toward a radical templatic phonology
In Section 2, we argued for a templatic approach to phonological development
in the child. In this section, we argue that a templatic approach is equally suited
to the analysis of adult phonology. This argument derives much from phoneti-
cally oriented, exemplar and usage-based approaches to phonology and from a
related approach to syntax, Radical Construction Grammar (Croft 2001).
3.1. Variation and phonological categories
One of the initial arguments for a templatic approach to child phonological
development is the variability of segment production. Such variability is
pervasive in adult phonological categories as well. Ohala writes, "One of
the major discoveries of phonetics for the past century is the tremendous
variability that exists in what we regard as the same event in speech, whether
this sameness be phones, syllables, or words" (Ohala 1993: 239). Ladefoged
and Maddieson's (1996) survey of segments across languages documents this
variability on virtually every page. Pierrehumbert, in a paper advocating an
approach to phonology that is quite similar to ours, also begins by
demonstrating the high degree of variation found not just in segments but also
in prosodic structures (Pierrehumbert 2003a: 120–7; see also Pierrehumbert,
Beckmann, and Roberts 2000).
This variability occurs at all levels, from individual usage events to
languages (that is, cross-linguistic variation). For example, vowel productions
are standardly mapped onto a two-dimensional F1–F2 space, and scatter plots
illustrate variation in production in usage events within and across individuals
(e.g., Pierrehumbert 2003b), leading to sociolinguistic variation (e.g., Labov
1994). Ladefoged and Maddieson (1996) document this variation as it eventually
manifests itself as divergence across dialects and across languages. For
example, at the dialect level, Californian English speakers use true
interdentals in a word such as [ ik] whereas British English speakers use a
dental fricative [ik] (p. 20). Cross-linguistically, many languages distinguish
dental and alveolar stops, particularly in India, Australia, and the Americas.
Most such languages contrast a laminal dental [t] vs. an apical alveolar [] as
in Toda [pot] "ten" vs. [pt] "cockroach", but Temne contrasts an apical dental
vs. a laminal alveolar (Ladefoged and Maddieson 1996). Most such languages also
have greater affrication of apical alveolars than laminal dentals, as in Isako,
but Dahalo has greater affrication of the laminal dentals (p. 25).
Variation is so pervasive that an adequate theory of phonology cannot ignore
it or properly abstract away from it (see Section 1). Pierrehumbert (2003a)
argues for an approach to phonological categories based on mathematical
psychology that accommodates variation:
    A category is a mental construct which relates two levels of representation,
    a discrete level and a parametric level. Specifically, a category defines a
    density distribution over the parametric level, and a category system
    defines a set of such distributions.
    Using the density distributions for categories in a category system,
    incoming signals may be recognized, identified, and discriminated through
    statistical choice rules. This understanding of categories has been
    generally adopted in experimental phonetics and sociolinguistics.
    (Pierrehumbert 2003a: 119)
We believe that this approach to categories can and should be adopted in
phonology as well.
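The notion of a category as a density distribution, with identification by a
statistical choice rule, can be made concrete with a minimal sketch. It assumes
a single phonetic parameter, normal distributions, and equal priors; the labels
"A" and "B" and all numeric values are hypothetical and do not come from
Pierrehumbert's paper or from any measured data.

```python
import math

def gaussian_density(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Hypothetical category system: each discrete label is paired with a density
# over one phonetic parameter (illustrative values only).
CATEGORY_SYSTEM = {
    "A": {"mean": 300.0, "sd": 40.0},
    "B": {"mean": 500.0, "sd": 60.0},
}

def posterior(x, system=CATEGORY_SYSTEM):
    """Relative likelihood of each category for signal x (equal priors assumed)."""
    likes = {label: gaussian_density(x, p["mean"], p["sd"])
             for label, p in system.items()}
    total = sum(likes.values())
    return {label: like / total for label, like in likes.items()}

def identify(x, system=CATEGORY_SYSTEM):
    """One possible choice rule: pick the category with the highest posterior."""
    post = posterior(x, system)
    return max(post, key=post.get)

print(identify(310.0), identify(520.0))
```

Under these assumptions a signal is never assigned to a category outright: it
receives a graded likelihood under every distribution in the system, and the
discrete label is the outcome of the choice rule.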
One result of this approach to categorization is that the segment categories
that can be formed from the actual input are not phonemes but positional
variants of phonemes (Pierrehumbert 2003a: 129–30). For example, tokens
of initial and final /s/ in English differ from each other significantly. Within
each position, /s/ and /z/ are reasonably well differentiated, but across
positions, there is substantial overlap between /s/ and /z/ tokens.
Pierrehumbert (2003a: 140) concludes that the engine of adult speech perception
appears to be positional segmental variants. Pierrehumbert's conclusion is
exactly that of our templatic approach: segmental phonological categories are
defined in terms of their position in a larger structure (the word template; see
Section 3.2). The evidence that Pierrehumbert amasses supports this view
for adult phonology as well.
Pierrehumbert restricts her attention to the identification of individual
segments, that is, positionally defined allophones. She notes that phonemes,
as categories of allophones in different positions, play little if any role in
adult speech perception (Pierrehumbert 2003a: 129). But contemporary generative
phonological theory does not refer much to phonemes either; for example,
phonemes are hardly mentioned in a recent survey of theories of phonological
representation (Ewen and van der Hulst 2002). Instead, a more abstract
or general category is used for phonological representation, namely features.
A feature is a more general category that subsumes multiple segments, namely,
all the segments that possess that feature.
Yet features as a more general category are problematic. For example, Ewen
and van der Hulst (2002) argue that the same vowels are categorized in different
ways depending on the relevant phonological process/phonotactic pattern
(pp. 15–21, 102–5). The vowels in (1), for example, are grouped according to
the category/feature of tenseness:
(1) [+tense] i e a u o
    [−tense] i
Ewen and van der Hulst argue that this categorization of vowels is needed to
describe a constraint on final stressed vowels in English (e.g., [+tense] /bi:/
vs. [−tense] */bi/).
A different categorization of the same vowels, given in (2), is necessary
for representing the constraint on possible vowels in a single word (vowel
harmony) in some languages. Vowel harmony in languages such as the
Asante dialect of Akan is governed by the feature of advanced/retracted tongue
root (ATR; Ewen and van der Hulst 2002: 19–20):
(2) [+ATR] i e u o
    [−ATR] i a
Finally, Ewen and van der Hulst (2002) argue that the categorization of
vowels in terms of the traditional feature of height is also necessary in order
to describe, for example, the stepwise shifts in vowel height of the English
Vowel Shift and also a diphthongization process in Skåne Swedish (pp. 20–1;
we have used a multivalued height feature here but most feature theories use
various devices to avoid multivalued features):
(3) [high] i u y
    [high-mid] e o
    [low-mid]
    [low]
Ewen and van der Hulst (2002) introduce three different features for grouping
the same sounds in the three different ways in (1)–(3) (they use the
single-valued features ATR [advanced tongue root] and @ [for laxness] and some
combination of features for height: pp. 102–5). That is, they have proposed a
distinct vowel feature for each of the three phonological phenomena they
describe. They write:
    The range of processes surveyed in this section suggest that vowel systems
    can be organized along different phonetic and phonological parameters, and
    hence that our feature system must be rich enough to be able to describe all
    of the parameters found to play a role in the organization of vowel systems.
    (Ewen and van der Hulst 2002: 21)
We agree with this statement but we raise the question, where does it stop?
For example, Ewen and van der Hulst (2002) observe in a footnote that with
respect to another English phonotactic phenomenon, occurrence before //,
the category of [tense] vowels must exclude //, and in other respects // acts
as a separate class (p. 18, fn 16). In other words, occurrence before // defines
a different natural class from that in (3), namely {i }. In principle a new
feature should be posited for that class. Otherwise one is in effect choosing
the distribution pattern defined by final stressed vowels over that defined by
occurrence before //, but there is no a priori reason to do so.
The logical conclusion to this process would be the positing of a different
feature for each category defined by each phonotactic constraint. This is in
fact what we are basically arguing for: even the more abstract categories
familiar to us from phonological theory are defined in terms of their position
in phonotactic templates. That is, phonological categories are defined in terms
of their distribution in templatic patterns. In other words, the phonotactic
templates are basic, and phonological categories are derivative (we return to
this point in Section 3.2).
A templatic approach to adult phonology is supported by the widespread
and well-known fact that the most general and abstract categories of sounds
(those usually described by features) actually differ in different word or
syllable positions. For example, Bybee (2001) suggests that consonants in
initial and final position are quite different in phonetic realization (compare
Pierrehumbert 2003a above), and that "consonant" as a category may not be
valid: onsets and codas may not be unified into a single set of consonants
(Bybee 2001: 88). She adds, "This proposal would predict that a language
could have a completely mutually exclusive set of syllable onsets and syllable
codas" (p. 88).
Although we are not familiar with such a language, some languages have
quite distinct sets of initial and final consonants with only partial overlap.
Sedang exhibits this pattern for stressed syllables and in addition has a third
series of consonants for initial consonants in an unstressed syllable preceding
the stressed syllable, called a "presyllable" (Smith 1979: 22, 26, 37), as shown
in Table 2.12.
In addition, there are consonant clusters with stops followed by /l/ or /r/.
The total count of initial vs. final consonants in Sedang is as given below
(clusters and the presyllabic consonants are excluded from this comparison):
(4) Initial: 41 consonants, 30 unique to initial position
    Final: 14 consonants, 3 unique to final position
    Overlap: 11 consonants
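The counts in (4) follow from simple set arithmetic, which the short sketch
below spells out. The function name and the added "shared proportion" measure
are illustrative assumptions; only the three input counts come from the text.

```python
def positional_overlap(n_first, n_second, n_shared):
    """Return (unique to first, unique to second, shared share of the union)."""
    unique_first = n_first - n_shared
    unique_second = n_second - n_shared
    union = unique_first + unique_second + n_shared
    return unique_first, unique_second, n_shared / union

# Sedang single consonants: 41 initial, 14 final, 11 occurring in both positions
print(positional_overlap(41, 14, 11))  # (30, 3, 0.25)
```

On these figures only a quarter of the 44 distinct single consonants occur in
both positions, which is one way of quantifying the partial overlap.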
Smith writes: "The dissimilarity of the final consonant inventory from the
initial single consonant inventory . . . recommends the establishment of a
separate consonantal system for each consonantal position of the phonological
word" (Smith 1979: 37). Moreover, the relationship between the syllable nucleus
and the final consonant is also complex: final zero and glides allow for
register and oral–nasal distinctions in the nucleus, final nasals allow for
register distinctions only, and other finals allow only oral–nasal distinctions
(Smith 1979: 42–4). This example demonstrates not only that one must distinguish
between syllable-initial and syllable-final consonants as distinct phonological
categories, but also that presyllable consonants are a distinct category. All
three categories of consonants are defined by their position in the Sedang word
template, as Smith recommends.
Table 2.12. Sedang consonant inventories by position
Initial stops: p t c k; ᵐb ⁿd j g; m n; pʰ tʰ cʰ kʰ; b d; m n; m n
Final stops: p t k; m n
Presyllabic stops: p t k; b; m
Initial continuants: s; l r j h; l r; 1 r j
Final continuants: l~r w j j; h; j h
Presyllabic continuants: s; l r j h
The closest example to mutually exclusive positional categories of a highly
general feature that we are aware of is found with the vowels of the
nineteenth-century Tremjugan dialect of Khanty (Abondolo 1998: 362). The
set of word-initial (stressed) vowels of Khanty (called V1 below) is not the
same as the set of noninitial vowels (V2; // and // are back unrounded vowels,
// is a front low unrounded vowel and // a back low rounded vowel; // and / /
are front and back central vowels, respectively):
(5) Initial vowels: ii ee uu oo
                    e o a
    Noninitial vowels: ii ee aa
    V1: 13 vowels, 9 unique to initial position
    V2: 8 vowels, 4 unique to noninitial position
    Overlap: 4 vowels
This analysis of Khanty vowels treats long vowels as a separate category (or set
of phonemes) from short vowels. There is good reason to do so; the qualities of
short and long vowels are quite different:
(6) Long (full) vowels: ii ee uu oo aa
Short (reduced) vowels: e o a
VV: 9 vowels, 5 qualities unique to long vowels
V: 8 vowels, 4 qualities unique to short vowels
Overlap: 4 vowels
This is a particularly sharp case where a highly abstract phonological category
differs quite substantially depending on the position of the phones in the
template. But it is a common phenomenon, particularly in comparing stressed
and unstressed vowels or long and short vowels (which are themselves often
phonotactically restricted) and also vowels occurring in more narrowly defined
positions in a word template, such as final syllables.
In fact, "consonant" and "vowel", to the extent that they are empirically valid
phonological categories, are themselves defined in terms of their position in
the syllable, characterized most broadly as periphery and nucleus respectively.
In this approach, then, what basically differentiates semivowels from vowels
and syllabic consonants from (ordinary) consonants is their position in the
syllable. Of course, the nature of the articulatory gestures is what allows the
sounds to function as either syllable nuclei or syllable peripheries. But that
is merely part of an ultimately phonetic explanation of the phonological
patterns (that is, which sounds occur in which syllable positions).
3.2. Words and templates as the basic units of phonology
All of the examples discussed in Section 3.1 imply that the empirically
supported phonological categories found at all levels of generalization, from
the most concrete (tokens of the same segment) to the most abstract (consonant
and vowel), are defined relative to a position in a phonological template,
generally a word template. If categories of segmental phonological units are
defined positionally relative to a word template, then the word template must be
the primary unit of phonological representation, and the individual segment
category is derived from it. This is exactly the approach that emerges from the
cross-linguistic developmental data examined in Section 2. Although
Pierrehumbert does not take this position explicitly, she does assume that the
lexicon is a central part of the cognitive architecture that is the target of
phonological acquisition (Pierrehumbert 2003a: 116) and she recognizes that the
ability to perceive what she calls prosodic structure, which is basically our
notion of template, must be (and is) acquired very early (Pierrehumbert 2003a:
140). Bybee explicitly takes the position that the word is the basic unit of
phonological representation (Bybee 2001: 29–31) and that segment categories are
emergent (Bybee 2001: 85).
The child begins with words, and templates are generalizations over the
phonological structure of words (compare Bybee 2001: 89–95). The templates
determine the phonological categories of a language, from the most concrete to
the most abstract. The arguments presented in this section imply that as the
child matures to become an adult speaker of her language, the phonological
representations of individual words and the phonological relations between words
do not change in any essential respect. Adult phonological representations
constitute a continuation of child representations. In the words of Ferguson and
Farwell (1975: 437), "we assume that a phonic core of remembered lexical items
and articulations which produce them is the foundation of an individual's
phonology . . . Thus we assume the primacy of lexical learning in phonological
development . . ." (emphasis ours; see also Beckman and Edwards 2000b).
The adult templates are both more general and more varied than those of the
child, but this is a difference in degree, not kind.
The exemplar and usage-based models propose that individual usage events
play a role in adult phonological representation. Exemplar approaches to word
recognition appear to provide a plausible model for the implicit emergence of
phonological structure from repeated memory traces (Goldinger 1996, 1998;
Pierrehumbert 2001). The basic idea is that memory traces of new experiences,
including speech input, are laid down with each exposure. These traces retain
detail (e.g., regarding speaker's voice characteristics and also context) over a
period of time; retention is longer in tasks drawing on implicit memory than in
explicit recall. As children listen to adult words in the period of first word
production, the input sequences represented in the greatest detail should be
those that automatically activate similar motor plans from the child's own vocal
production repertoire. These sequences may also be retained as traces of often
repeated babbling in the child's own voice. Note that the effects of existing
patterns will necessarily be strongest at the outset of identifiable word
production. Computer modeling shows that abstraction is the automatic
consequence of aggregate activation of high-frequency tokens, with regression
toward central tendencies as numbers of highly similar exemplars accumulate: the
single voice advantage diminishes as word frequencies increase. Old High
Frequency words inspire abstract echoes, obscuring context and voice elements of
the study trace (Goldinger 1998: 255). The appropriate size of the phonological
exemplar is a word, because a word is a unit of usage that is both
phonologically and pragmatically appropriate in isolation (Bybee 2001: 30), that
is, the smallest linguistic unit encountered in language use.
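The claim that abstraction falls out of aggregate activation can be illustrated
with a toy exemplar memory. This is a sketch under stated assumptions, not a
reimplementation of the simulations cited above: each trace is a hypothetical
pair (phonetic value, voice value), similarity is an arbitrary decreasing
function of distance, and raising it to the power 3 is simply one way of letting
near matches dominate.

```python
import math

def similarity(a, b):
    """Similarity in (0, 1]: 1 for identical traces, falling with distance."""
    return 1.0 / (1.0 + math.dist(a, b))

def echo(probe, traces, power=3):
    """Activation-weighted average of all stored traces for one word."""
    weights = [similarity(probe, t) ** power for t in traces]
    total = sum(weights)
    return [sum(w * t[i] for w, t in zip(weights, traces)) / total
            for i in range(len(probe))]

def voice_component(n_other_voice_traces):
    """Voice value of the echo when exactly one trace matches the probe voice."""
    same_voice = (0.5, 1.0)    # (phonetic value, voice value), hypothetical
    other_voice = (0.5, 0.0)   # same word, different voice, hypothetical
    traces = [same_voice] + [other_voice] * n_other_voice_traces
    return echo(same_voice, traces)[1]

low_frequency = voice_component(1)     # word stored twice
high_frequency = voice_component(20)   # word stored 21 times
print(round(low_frequency, 3), round(high_frequency, 3))
```

With only two stored traces the echo stays close to the probe's own voice; with
twenty-one it is pulled toward the aggregate, so the voice-specific detail of
any one trace contributes less as the word becomes more frequent.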
Frequency plays a significant role in the representation of phonological
knowledge of adults as well as children learning language. Experimental work
with adults, using nonword stimuli, has shown that language users are highly
sensitive to the phonotactic regularities implicit in the lexicon (Vitevich,
Luce, Charles-Luce, and Kemmerer 1997; Vitevich and Luce 1998, 1999; Frisch 2000;
Frisch, Large, and Pisoni 2000; Frisch and Zawaydeh 2001; Treiman 2000;
Bailey and Hahn 2001; see also Pierrehumbert 2003b). Bybee (2001) surveys
diachronic and typological as well as experimental evidence demonstrating the
role of token and type frequency in phonological organization and processes.
Edwards, Beckman, and Munson (2004) have demonstrated such lexical
frequency effects in children, the strength of existing patterns being inversely
correlated with vocabulary size. They argue that children develop an implicit
phonological grammar out of the words they learn holistically (p. 422). The
phonological grammar so derived permits access to sublexical patterns in both
perception and production. Those patterns include both typical acoustic
fragments and abstract phonological categories (phoneme sequences), and access
is facilitated by both auditory and articulatory experience with words.
It should be noted that much current research in phonological theory, as
surveyed in Ewen and van der Hulst (2002), goes in the opposite direction to the
approach discussed here, by attempting to simplify and further generalize
abstract phonological structures. But the reality of the complex variation in
phonological patterns leads to a proliferation of theoretical constructs to deal
with violations of the constraints imposed by the highly general and simple
structures. The set of phonological features has been simplified through the
postulation of such principles as binarity, under-specification and
single-valued features (Ewen and van der Hulst 2002: 54, 63–85). But theorists
have consequently been required to posit constructs such as redundancy
constraints, default rules, the Redundancy Rule Ordering Constraint, dependency
and particles (Ewen and van der Hulst 2002: 66–8, 75–7, 91–2, 102–5). The
inventory of syllable structures has been simplified through the postulation of
the sonority sequencing generalization and the hypothesis that all syllable
structures are binary branching (Ewen and van der Hulst 2002: 136, 175).
Again, this has required the positing of constructs such as syllable prependices
and appendices, extrasyllabic segments, empty syllable positions, and licensing
and government relations between segments in syllables (Ewen and van der
Hulst 2002: 136–9, 147–50, 165, 174–93). Finally, the inventory of metrical
feet has been simplified by various principles, in particular the principle that
all feet are binary (Ewen and van der Hulst 2002: 226). Again, this has required
the positing of constructs such as monosyllabic feet, degenerate feet, weak
local parsing, extrametricality and footless languages (Ewen and van der Hulst
2002: 226, 228–37). In our view, these additional theoretical constructs are ad
hoc, and their proliferation strongly suggests that this sort of simplification
in representation does not lead to natural empirical generalizations. In
contrast, the only phonological categories posited by a templatic approach to
phonology are (i) words; (ii) word templates of varying degrees of schematicity;
and (iii) syllable and segment categories as subparts of those phonological
templates, defined in terms of their occurrence in particular template
positions. This is a formally simple model, utilizing a minimum of theoretical
constructs.
The templatic approach to phonology is further supported by nonlinear
representations (van der Hulst and Smith 1982; Goldsmith 1990).
Phonological properties or features are not specifically bound to particular
segment positions in a word: they can be restricted to a single segment position
or extended over multiple positions (which may be limited to consonantal slots
only or vocalic slots only). This hypothesis about the mapping of phonological
properties onto skeletal positions has been formalized by representing each
feature on its own tier (Ewen and van der Hulst 2002: 41–4). Articulatory
phonology (Browman and Goldstein 1989, 1991, 1992; see also Bybee 2001:
69–77) takes this trend to its logical conclusion. Articulatory phonology is a
directly phonetically based nonlinear model, in which the articulatory gestures
are the basic phonological features, and the nonlinear mapping of gestures is
the result of the complex motor coordination of the gestures to produce a word.
The execution and coordination of articulatory gestures are the source of most
phonological processes. Nonlinear models take inspiration from Firth's (1957)
prosodic approach to phonology. Firth uses the metaphor of a musical score to
describe his prosodic representations (pp. 137–8), very similar to the tiers of
contemporary nonlinear models and specifically the articulatory score of
Browman and Goldstein.
Firth emphasizes a further point about nonlinear models which links them to
a templatic approach to phonology. If features are not simply mapped onto
segment positions, then the basic unit of phonological structure is the domain of
the complex mapping of features, i.e., the word, or even a larger unit (Firth
1957: 121). A nonlinear model must represent a larger unit than a single seg-
ment, because the mapping between tiers spreads across segments. In fact, the
domain of the mapping is more basic than the individual segments in the
skeleton of a word, because the assignment of features to a segmental position
in the skeleton is determined by the mapping. Thus nonlinear phonology has
already moved away from segments to larger units as the basic units of analysis.
A templatic phonology brings this tendency to its logical conclusion by treating
the word as the basic unit of phonological representation.
Our templatic approach to phonological representation is centrally concerned
with a redefinition of phonological categories of segments in words according to
their phonotactic position as defined by syllable and word structure. This
mirrors a constructional approach to syntactic representation, in particular
Radical Construction Grammar (Croft 2001). Croft argues that the variation in
syntactic category membership and definition within and across languages
requires that syntactic categories be defined ultimately in terms of their
position or role in the syntactic constructions used to define them. The
approach is described as "radical" in order to emphasize that the constructions
are basic and the syntactic categories of particular units are derived from the
constructions. In this respect our templatic approach to phonology is also
radical. Radical Construction Grammar also adopts the definition of categories
used by Pierrehumbert, as a level of discrete categories mapped onto a density
distribution of individual functions or meanings, the conceptual space parallel
to the space defined by phonetic parameters. This model of categories is known
as the semantic map model in typological theory (Haspelmath 2003; Croft 2003;
Croft and Poole 2008). In this respect the radical templatic model of
phonological representation is conceptually the same as the radical
constructional model of syntactic representation.
We conclude by responding to an objection to an exemplar-based model such
as that advocated here. It appears that an exemplar-based model presupposes the
very categories that it defines by its exemplars. How does the speaker know that
the various exemplars of /p/ or // in different words are instances of the same
phonological category, and not exemplars of phonetically neighboring categories
in the phonetic parameter space? For example, Labov's research on a single
individual's productions of vowel tokens (Labov 1994, inter alia) demonstrates
that individual exemplars of one phoneme will be included in the phonetic range
of another phoneme: for example, some exemplars of // will occur in the range
of exemplars of //. How does a speaker know that those tokens are exemplars of
// and not //? This question cannot be answered in a purely segment-based
approach to phonological representation. If one begins with segments, one must
have a definition of those segments that is either ultimately phonetic, or else
purely arbitrary (i.e., a particular exemplar is stipulated to be an exemplar of //
even if its actual realization is [] in purely phonetic terms).
On the other hand, if one begins with words as phonological units, then the
question can be answered and the paradox is solved. The phonetically outlying
token is an exemplar of // because it is part of a specific word, and other
occurrences of that word contain exemplars that cluster around the central
phonetic tendency for //. How is the word identified as the same word? The
word is of course identified as the same by its meaning in the context of use,
linked to prior occurrences of the word with that meaning in similar contexts of
use. In other words, we return to the starting point of our perspective on
phonology: phonology, like other aspects of language, must begin from the
sound–meaning link that is central to the symbolic nature of language.
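The word-based resolution of the paradox can be illustrated with a minimal sketch, which is not part of the original argument. It assumes a single phonetic dimension (for instance a first-formant value in Hz), two neighboring vowel categories with given central tendencies, and a small store of word tokens. All names (`CATEGORY_CENTRES`, `WORD_TOKENS`, `nearest_category`, `category_via_word`) and all numerical values are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Assumed central tendencies of two neighboring vowel categories on one
# phonetic dimension (e.g. first formant, in Hz). Values are invented.
CATEGORY_CENTRES = {"V1": 600.0, "V2": 720.0}

# Hypothetical exemplar store: each word maps to the phonetic values of its
# remembered tokens. The last token of "word_a" is a phonetic outlier.
WORD_TOKENS = {
    "word_a": [590.0, 605.0, 610.0, 700.0],
    "word_b": [715.0, 725.0, 730.0],
}


def nearest_category(value, centres=CATEGORY_CENTRES):
    """Segment-based guess: pick the category whose centre is closest."""
    return min(centres, key=lambda cat: abs(centres[cat] - value))


def category_via_word(word, token_index, store=WORD_TOKENS,
                      centres=CATEGORY_CENTRES):
    """Word-based guess: classify a token by where the OTHER tokens of the
    same word cluster, ignoring the token's own phonetic value."""
    others = [v for i, v in enumerate(store[word]) if i != token_index]
    return nearest_category(mean(others), centres)


outlier = WORD_TOKENS["word_a"][3]
print(nearest_category(outlier))       # phonetics alone places it in V2
print(category_via_word("word_a", 3))  # word membership places it in V1
```

In this toy setting the outlying token falls within the phonetic range of the neighboring category, yet it is assigned to the category around which the other tokens of its word cluster. The sketch takes the identification of the word for granted; in the account above that identification rests on meaning in context.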
Notes
1. The term 'template' has been used in generative phonology in reference to analyses
in which fixed prosodic structures (syllabic and metrical) have been posited to
account for patterns in which segmental material appears to be matched or fitted
into such templates (see, for example, the analyses summarized in Kenstowicz [1994:
270–4, 622–5]; see also McCarthy and Prince [1988, 1990]). Our use of the term
follows the usage in phonological development: it is more general, in that it describes
word-sized patterns at all levels of phonological organization, and is not restricted to
template-matching or template-fitting processes.
2. Larger structures, namely constructions, are also symbolic units. Constructions may
have distinctive phonological properties, specifically prosodic properties. However,
these are beyond the scope of this article, which limits itself to segmental phonological
representations.
3. See Vihman (1996) for a review of the long-standing debate regarding the role of
perception in word production errors.
4. Note that we disregard changes in voicing in all of the developmental analyses:
voicing is not generally thought to be under voluntary control at this age, nor is
transcription of voicing in child production reliable without acoustic verification.
See Macken (1980) for an overview of the acquisition of voicing contrasts.
5. The velar stop /k/ was produced as [k] only before the (whispered) back vowel [] at
this stage; it was fronted to [t] before front vowels (see Vihman 1976).
6. Note that we are disregarding initial glottal stop, which is notoriously difficult to
transcribe reliably (Vihman et al. 1985). Examples of 'no onset' can be found in
Tables 2.4 (P: initial fricatives omitted), 2.5 (Madli: initial /k/ and /s/), and 2.8
(Eeriku: initial /k/, /h/, and /v/).
7. /l/ and /r/ are treated as distinct in initial position but as variants in final position;
Smith does not describe the nature of the final liquid variation. We treat both /l/ and /r/
as occurring in both initial and final position.
References
Abondolo, D. (1998). Khanty. In D. Abondolo (ed.), The Uralic languages, pp. 358–86.
London: Routledge.
Bailey, T. M. and Hahn, U. (2001). Determination of wordlikeness: phonotactics or
lexical neighborhoods? Journal of Memory and Language, 44, 568–91.
Beckman, M. E. and Edwards, J. (2000a). Lexical frequency effects on young children's
imitative productions. In M. B. Broe and J. B. Pierrehumbert (eds.), Papers in
laboratory phonology V: Acquisition and the lexicon, pp. 208–18. Cambridge
University Press.
(2000b). The ontogeny of phonological categories and the primacy of lexical learning
in linguistic development. Child Development, 71(1), 240–9.
Berg, T. and Schade, U. (2000). A local connectionist account of consonant harmony in
child language. Cognitive Science, 24(1), 123–49.
Berman, R. A. (1977). Natural phonological processes at the one-word stage. Lingua, 43,
1–21.
Bhaya Nair, R. (1991). Monosyllabic English or disyllabic Hindi? Indian Linguistics, 52,
51–90.
Bowerman, M. and Choi, S. (2001). Shaping meanings for language: universal and
language-specific in the acquisition of spatial semantic categories. In
M. Bowerman and S. C. Levinson (eds.), Language acquisition and conceptual
development, pp. 475–511. Cambridge University Press.
investigation of vowel formants in babbling. Journal of Child Language, 16(1), 1–17.
Boysson-Bardies, B. and Vihman, M. M. (1991). Adaptation to language: evidence from
babbling and first words in four languages. Language, 67(2), 297–319.
Browman, C. P. and Goldstein, L. (1989). Articulatory gestures as phonological units.
Phonology, 6(2), 201–51.
(1991). Gestural structures: distinctiveness, phonological processes and historical
change. In I. G. Mattingly and M. Studdert-Kennedy (eds.), Modularity and the
motor theory of speech perception, pp. 313–38. Hillsdale, NJ: Lawrence Erlbaum.
(1992). Articulatory phonology: an overview. Phonetica, 49, 155–80.
Brulard, I. and Carr, P. (2003). French–English bilingual acquisition of phonology: one
production system or two? International Journal of Bilingualism, 7(2), 177–202.
Bybee, J. L. (2001). Phonology and language use. Cambridge University Press.
Chiat, S. (1979). The role of the word in phonological development. Linguistics, 17,
491–610.
Clark, H. H. (1996). Using language. Cambridge University Press.
Croft, W. (2000). Explaining language change: an evolutionary approach. Harlow:
Longman.
(2001). Radical construction grammar: syntactic theory in typological perspective.
(2003). Typology and universals, 2nd edn. Cambridge University Press.
Croft, W. and Poole, K. T. (2008). Multidimensional scaling and other techniques for
uncovering universals [response to commentaries]. Theoretical Linguistics, 34, 75–84.
Davis, B. L. and MacNeilage, P. F. (1990). Acquisition of correct vowel production: a
quantitative case study. Journal of Speech and Hearing Research, 33, 16–27.
(1995). The articulatory basis of babbling. Journal of Speech and Hearing Research,
38, 1199–1211.
(2000). An embodiment perspective on the acquisition of speech perception.
Phonetica, 57, 229–41.
(2002). Acquisition of serial complexity in speech production: a comparison of
phonetic and phonological approaches to first word production. Phonetica, 59,
75–107.
DePaolis, R. A. (2006). The influence of production on the perception of speech. In
D. Bamman, T. Magnitskaia, and C. Zaller (eds.), Proceedings of the 30th Boston
University Conference on Language Development, pp. 142–53. Somerville, MA:
Cascadilla Press.
D'Odorico, L., Carubbi, S., Salerni, N., and Calvo, V. (2001). Vocabulary development
in Italian children: a longitudinal evaluation of quantitative and qualitative aspects.
Journal of Child Language, 28(3), 351–72.
Donahue, M. L. (1986). Phonological constraints on the emergence of two-word
utterances. Journal of Child Language, 13(2), 209–18.
Edwards, J., Beckman, M. E., and Munson, B. (2004). The interaction between vocabulary
size and phonotactic probability effects on children's production accuracy and
fluency in nonword repetition. Journal of Speech, Language, and Hearing
Research, 47, 421–36.
Elbers, L. and Ton, J. (1985). Play pen monologues: the interplay of words and babble in
the first words period. Journal of Child Language, 12(3), 551–65.
Elbers, L. and Wijnen, F. (1992). Effort, production skill, and language learning. In
Elsen, H. (1996). Two routes to language: stylistic variations in one child. First
Language, 16(2), 141–58.
Ewen, C. J. and Hulst, H. van der (2002). The phonological structure of words: an
introduction. Cambridge University Press.
ition. Language, 51(2), 419–39. Reprinted in this volume as Chapter 4.
Ferguson, C., Peizer, D. B., and Weeks, T. E. (1973). Model-and-replica phonological
grammar of a child's first words. Lingua, 31(1), 35–65.
Firth, J. R. (1957). Sounds and prosodies. Papers in linguistics, 1934–1951, 121–38.
Francescato, G. (1968). On the role of the word in first language acquisition. Lingua, 21,
144–53.
French, A. (1989). The systematic acquisition of word forms by a child during the first-
fifty-word stage. Journal of Child Language, 16(1), 69–90.
Frisch, S. A. (2000). Temporally organized lexical representations as phonological units.
In M. B. Broe and J. B. Pierrehumbert (eds.), Papers in laboratory phonology V:
Acquisition and the lexicon, pp. 283–98. Cambridge University Press.
Frisch, S. A., Large, N. R., and Pisoni, D. B. (2000). Perception of wordlikeness: effects
of segment probability and length on the processing of nonwords. Journal of
Memory and Language, 42, 482–96.
Frisch, S. A. and Zawaydeh, B. A. (2001). The psychological reality of OCP-place in
Arabic. Language, 77, 91–106.
Goldinger, S. D. (1996). Words and voices: episodic traces in spoken word identification
and recognition memory. Journal of Experimental Psychology: Learning, Memory,
and Cognition, 22(5), 1166–83.
(1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review,
105, 251–79.
Goldsmith, J. A. (1990). Autosegmental and metrical phonology. Oxford: Blackwell.
Grunwell, P. (1982). Clinical phonology. London: Croom Helm.
Haspelmath, M. (2003). The geometry of grammatical meaning: semantic maps and
cross-linguistic comparison. In M. Tomasello (ed.), The new psychology of language,
vol. 2, pp. 211–42. Mahwah, NJ: Lawrence Erlbaum.
Hulst, H. van der and Smith, N. V. (1982). An overview of autosegmental and metrical
phonology. In H. van der Hulst and N. Smith (eds.), The structure of phonological
representations, vol. 1, pp. 1–45. Dordrecht: Foris.
Jaeger, J. J. (1997). How to say 'Grandma': the problem of developing phonological
representations. First Language, 17(1), 1–29.
Jakobson, R. (1968). Child language, aphasia, and phonological universals, trans.
A. R. Keiler. The Hague: Mouton. (Originally published as Kindersprache,
Aphasie und allgemeine Lautgesetze. Uppsala: Almqvist & Wiksell, 1941.)
Jusczyk, P. W. (1992). Developing phonological categories from the speech signal. In
(1997). The discovery of spoken language. Cambridge, MA: MIT Press.
Keller, R. (1994). On language change: the invisible hand in language. London:
Routledge.
Kenstowicz, M. (1994). Phonology in generative grammar. Oxford: Blackwell.
Kent, R. D. (1992). The biology of phonological development. In C. A. Ferguson,
L. Menn, and C. Stoel-Gammon (eds.), Phonological development: models,
research, implications, pp. 65–90. Timonium, MD: York Press.
Kent, R. D. and Bauer, H. R. (1985). Vocalizations of one-year olds. Journal of Child
Language, 13(3), 491–526.
Kõrgvee, K. (2001). Lapse sõnavara areng vanuses 1;8–2.1 [A child's lexical development,
aged 1;3–2;1]. Unpublished undergraduate thesis, Tartu University.
Kunnari, S. (2000). Characteristics of early lexical and phonological development in
children acquiring Finnish (Acta Universitatis Ouluensis B 34 Humaniora). Oulu:
Oulu University Press.
Labov, W. (1994). Principles of linguistic change, vol. 1: Internal factors. Oxford:
Blackwell.
Labov, W. and Labov, T. (1978). The phonetics of cat and mama. Language, 54(4),
816–52.
Ladefoged, P. and Maddieson, I. (1996). The sounds of the world's languages. Oxford:
Blackwell.
Leopold, W. F. (1939). Speech development of a bilingual child, vol. 1: Vocabulary
growth in the first two years. Evanston, IL: Northwestern University Press.
Lieven, E. V. M., Theakston, A. L., Pine, J. M., and Rowland, C. F. (2000). The use and
non-use of auxiliary be. In E. Clark (ed.), The proceedings of the Thirtieth Annual
Child Language Research Forum, pp. 5158. Cambridge University Press.
Locke, J. L. (1983). Phonological acquisition and change. New York: Academic Press.
Locke, J. and Pearson, D. M. (1992). Vocal learning and the emergence of phonological
capacity: a neurobiological approach. In C. A. Ferguson, L. Menn, and C. Stoel-
Gammon (eds.), Phonological development: models, research, implications,
pp. 91–129. Timonium, MD: York Press.
Macken, M. A. (1978). Permitted complexity in phonological development: one child's
acquisition of Spanish consonants. Lingua, 44, 219–53.
(1979). Developmental reorganization of phonology: a hierarchy of basic units of
acquisition. Lingua, 49, 11–49.
(1980). Aspects of the acquisition of stop systems: a cross-linguistic perspective. In
G. Yeni-Komshian, J. F. Kavanagh, and C. A. Ferguson (eds.), Child phonology,
vol. 1: Production, pp. 143–68. New York: Academic Press.
(1992). Where's phonology? In C. A. Ferguson, L. Menn, and C. Stoel-Gammon
(eds.), Phonological development: models, research, implications, pp. 249–69.
Timonium, MD: York Press.
(1995). Phonological acquisition. In J. Goldsmith (ed.), The handbook of phonological
theory, pp. 671–96. Cambridge, MA: Blackwell.
(1996). Prosodic constraints on features. In B. Bernhardt, J. Gilbert, and D. Ingram
(eds.), Proceedings of the UBC International Conference on Phonological
Acquisition, pp. 159–72. Somerville, MA: Cascadilla Press.
McCarthy, J. and Prince, A. (1988). Quantitative transfer in reduplicative and templatic
morphology. Linguistics in the Morning Calm, 2, 3–35.
(1990). Foot and word in prosodic morphology: the Arabic broken plural. Natural
Language and Linguistic Theory, 8, 209–84.
McCune, L. and Vihman, M. M. (2001). Early phonetic and lexical development: a
productivity approach. Journal of Speech, Language and Hearing Research, 44,
670–84.
Matthei, E. (1989). Crossing boundaries: more evidence for phonological constraints on
early multi-word utterances. Journal of Child Language, 16(1), 41–54.
Menn, L. (1971). Phonotactic rules in beginning speech: a study in the development of
English discourse. Lingua, 26, 225–51.
(1983). Development of articulatory, phonetic, and phonological capabilities. In
Menn, L. and Matthei, E. (1992). The two-lexicon account of child phonology:
looking back, looking ahead. In C. A. Ferguson, L. Menn, and C. Stoel-Gammon
(eds.), Phonological development: models, research, implications, pp. 211–47.
Ohala, J. (1993). The phonetics of sound change. In C. Jones (ed.), Historical linguistics:
problems and perspectives, pp. 237–78. London: Longman.
Oller, D. K. (1980). The emergence of the sounds of speech in infancy. In
G. Yeni-Komshian, J. F. Kavanagh, and C. A. Ferguson (eds.), Child phonology,
vol. 1: Production, pp. 93–112. New York: Academic Press.
(2000). The emergence of the speech capacity. Mahwah, NJ: Lawrence Erlbaum.
Pierrehumbert, J. (2001). Exemplar dynamics: word frequency, lenition and contrast. In
J. L. Bybee, and P. Hopper (eds.), Frequency and emergence in grammar,
(2003a). Phonetic diversity, statistical learning, and acquisition of phonology.
Language and Speech, 46(2/3), 115–54.
(2003b). Probabilistic theories of phonology. In R. Bod, J. B. Hay, and S. Jannedy
(eds.), Probability theory in linguistics, pp. 177–228. Cambridge, MA: MIT Press.
Pierrehumbert, J., Beckman, M. E., and Ladd, D. R. (2000). Conceptual foundations of
phonology as a laboratory science. In N. Burton-Roberts, P. Carr, and G. Docherty
(eds.), Phonological knowledge: conceptual and empirical issues, pp. 273–304.
Reichling, A. J. B. N. (1935). Het woord; een studie omtrent de grondslag van taal en
taalgebruik. Nijmegen. Reprinted Zwolle, 1967.
Salo, A. (1993). Muutelõppude ilmumine ühe eesti lapse keelde vanuses 1;5–2;5 [The
emergence of inflectional endings in the language of one Estonian child aged
1;5–2;5]. Undergraduate thesis, Finno-Ugric Languages Department, Tartu University.
Savinainen-Makkonen, T. (2000). Word initial consonant omissions – a developmental
process in children learning Finnish. First Language, 20(2), 161–85.
(2001). Suomalainen lapsi fonologiaa omaksumassa [Finnish children acquiring
phonology]. Publications of the Department of Phonetics 42. Helsinki:
Department of Phonetics, University of Helsinki.
Schwartz, R. G. (1988). Phonological factors in early lexical acquisition. In M. D. Smith
and J. L. Locke (eds.), The emergent lexicon: the child's development of a linguistic
vocabulary, pp. 185–222. New York: Academic Press.
Smith, K. D. (1979). Sedang grammar (Pacific Linguistics B-50). Honolulu: University
of Hawaii Press.
Stager, C. L. and Werker, J. F. (1997). Infants listen for more phonetic detail in speech
perception than in word-learning tasks. Nature, 388, 381–2.
Stemberger, J. P. (1988). Between-word processes in child phonology. Journal of Child
Language, 15(1), 39–62.
Stoel-Gammon, C. and Cooper, J. A. (1984). Patterns of early lexical and phonological
development. Journal of Child Language, 11(2), 247–71.
Studdert-Kennedy, M. and Goodell, E. W. (1995). Gestures, features and segments in
early child speech. In B. de Gelder and J. Morais (eds.), Speech and reading,
pp. 65–88. Hove: Lawrence Erlbaum.
Tomasello, M. (1992). First verbs. Cambridge University Press.
Treiman, R., Kessler, B., Knewasser, S., Tincoff, R., and Bowman, M. (2000). English
speakers' sensitivity to phonotactic patterns. In M. Broe and J. B. Pierrehumbert
(eds.), Papers in laboratory phonology V: Acquisition and the lexicon, pp. 269–82.
Vihman, M. M. (1976). From prespeech to speech: on early phonology. Papers and
Reports on Child Language Development, 12, 230–44.
(1978). Consonant harmony: its scope and function in child language. In
J. H. Greenberg (ed.), Universals of human language, pp. 281–334. Stanford
University Press.
(1981). Phonology and the development of the lexicon: evidence from children's
errors. Journal of Child Language, 8(2), 239–64.
(1992). Early syllables and the construction of phonology. In C. A. Ferguson,
research, implications, pp. 393–422. Timonium, MD: York Press.
(1993). Variable paths to early word production. Journal of Phonetics, 21(1/2), 61–82.
(1996). Phonological development: the origins of language in the child. Oxford:
Blackwell.
(2002a). Getting started without a system: from phonetics to phonology in bilingual
development. International Journal of Bilingualism, 6(3), 239–54.
(2002b). The role of mirror neurons in the ontogeny of speech. In M. Stamenov and
V. Gallese (eds.), Mirror neurons and the evolution of brain and language,
Vihman, M. M. and DePaolis, R. A. (2000). Prosodic development: a cross-linguistic
analysis of the first word period. End of award report, Economic and Social
Research Council Award R000237087.
Vihman, M. M., DePaolis, R. A., and Davis, B. L. (1998). Is there a 'trochaic bias' in
early word learning? Evidence from English and French. Child Development, 69(4),
933–47.
Vihman, M. M., Kay, E., Boysson-Bardies, B. de, Durand, C., and Sundberg, U. (1994a).
External sources of individual differences? A cross-linguistic analysis of the
phonetics of mothers' speech to one-year-old children. Developmental Psychology,
30(5), 652–63.
Vihman, M. M., Macken, M. A., Miller, R., Simmons, H., and Miller, J. (1985).
From babbling to speech: a re-assessment of the continuity issue. Language, 61(2),
397–445.
Vihman, M. M. and Miller, R. (1988). Words and babble at the threshold of lexical
acquisition. In M. D. Smith and J. L. Locke (eds.), The emergent lexicon: the child's
development of a linguistic vocabulary, pp. 151–83. New York: Academic Press.
Vihman, M. M. and Nakai, S. (2003). Experimental evidence for an effect of vocal
experience on infant speech perception. In M. J. Solé, D. Recasens, and
J. Romero (eds.), Proceedings of the 15th International Congress of Phonetic
Sciences, Barcelona, pp. 1017–20. Barcelona: Universitat Autònoma de Barcelona.
Vihman, M. M., Nakai, S., and DePaolis, R. A. (2006). Getting the rhythm right: a cross-
linguistic study of segmental duration in babbling and first words. In L. Goldstein,
K. Best, D. Whalen, and S. Anderson (eds.), Papers in laboratory phonology VIII:
Varieties of phonological competence, pp. 341–66. Cambridge University Press.
Vihman, M. M. and Velleman, S. L. (1989). Phonological reorganization: a case study.
Language and Speech, 32(2), 149–70. Reprinted in this volume as Chapter 8.
(2000). The construction of a first phonology. Phonetica, 57(2–4), 255–66.
Vihman, M. M., Velleman, S. L., and McCune, L. (1994b). How abstract is child
phonology? In M. Yavas (ed.), First and second language phonology, pp. 9–44.
San Diego, CA: Singular Publishing Group. Reprinted in this volume as Chapter 9.
Vitevitch, M. S. and Luce, P. A. (1998). When words compete: levels of processing in the
perception of spoken words. Psychological Science, 9, 325–9.
(1999). Probabilistic phonotactics and neighborhood activation in spoken word
recognition. Journal of Memory and Language, 40, 374–408.
Vitevitch, M. S., Luce, P. A., Charles-Luce, J., and Kemmerer, D. (1997). Phonotactics
and syllable stress: implications for the processing of spoken nonsense words.
Language and Speech, 40, 47–62.
(1987). Prosodic phonology: the theory and its application to language acquisition
and speech processing. Newcastle upon Tyne: Grevatt & Grevatt.
Weeks, T. E. (1974). The slow speech development of a bright child. Lexington, MA:
Lexington Books.
Werker, J. F., Fennell, C. T., Corcoran, K. M., and Stager, C. L. (2002). Infants' ability
to learn phonetically similar words: effects of age and vocabulary size. Infancy,
3(1), 1–30.
Part II
Setting papers
3 Child phonology: a prosodic view
Natalie Waterson
During the past few years linguists have begun to express doubts about the
validity of segmental analysis for the study of child language and have come to
feel that it is probable that a child perceives spoken language differently from
an adult, e.g., Bellugi and Brown (1964: 113), Ingram (1966: 218), Ladefoged
(1967: 148–9), Lenneberg (1967: 279–81), Weir (1962: 30). They are also
beginning to realize the importance of treating a child's language as having its
own independent system (Carroll 1961: 332; Fry 1966: 194). To date, however,
investigators of child speech have made their analyses on a segmental and
distributional basis and have expressed the child's phonological system in
terms of the adult's phonemic system; one may cite for example Cohen
(1969), Grégoire (1933, 1947), Jakobson (1941/1968), Jakobson and Halle
(1961), Leopold (1939, 1947, 1961), Lewis (1968), Ohnesorg (1959), Velten
(1943), and Weir (1962). Such studies, valuable though they be, cannot be said
to have succeeded in explaining the relationship between the forms and structures
of child and adult and have left many questions unanswered such as:
(1) What governs the choice of sound that the child will use as a substitute for
an adult sound?
(2) Why does the child drop certain sounds of the adult form or substitute
for them when he is already capable of making such sounds and is in fact
using them in some other contexts?
(3) Why does the child use homonyms for adult forms which appear to be quite
unlike each other and which have been proved to be semantically clearly
differentiated for the child?
(4) What governs the form that reduplicated words take?
This chapter offers an approach which provides new insights into the
relationship between a child's phonetic forms and phonological structures
and those of the adult system through the use of a nonsegmental analysis
which gives greater freedom to express various correlations between child
and adult forms and structures. What is suggested is an articulatory feature
analysis for the phonetic description and prosodic analysis for the phonology.
The features used are those that arise from the material under investigation,
i.e., those required to describe the particular forms of the child and adult at
the time the child was approximately 18 months old. They are not the distinctive
features of generative phonology and are not intended to be considered
universal. Current phonetic terminology and systems of transcription are
not geared to nonsegmental description so that the terminology used in the
phonetic account is sometimes clumsy and terms are not always used in a
familiar way.
I am indebted to my colleagues Professor E. J. A. Henderson, Professor R. H. Robins, Professor
C. E. Bazell, Dr. T. Bynon, and Dr. E. Dunstan who read the draft of this paper and made some very
helpful criticisms, and am especially grateful to Mrs. E. M. Whitley with whom I have had several
hours of extremely fruitful discussion on the subject.
The aim of this chapter is to show by means of a nonsegmental type of
analysis that a child's language has its own independent system which, though
different from the adult system, is closely related to it even where the child's
forms appear to be quite unlike the adult's. The analysis proposed here, by
demonstrating the relationship between child and adult forms and structures,
makes it possible to suggest solutions to the questions listed above. The chapter
does not deal with the sequence in which a child acquires the different sounds
of a language. This aspect of language acquisition has already been widely
discussed by Jakobson and others (Jakobson 1941/1968; Jakobson and Halle
1961; Leopold 1939, 1947, 1961; Lewis 1968; Ohnesorg 1959; Velten 1943).
The study is based on the writer's eldest child, P. Daily records were kept but
the material was not analyzed until some years after the collection of data was
complete. As a child's linguistic development is individual, being conditioned
by his particular environment, the phonetic and phonological descriptions given
here are applicable only to the speech of the child P. However, it is possible
that the way he acquired his speech forms is similar to that of other children and
thus the findings may have a more general application. A brief outline of P's
environment is now given to put the individual aspect of his acquisition of
speech into perspective.
P, when a baby, was spoken to a great deal and was given plenty of attention
at times when he wanted it. He was not left to cry or to lie awake inactive for
long periods in his cot or pram with nothing to see or do. In his waking hours
he was often put on the floor indoors or the ground out-of-doors to look around
and engage in exercising his limbs, and was given toys and other material to
investigate. He started walking holding-on at 6 months, and unaided at 9 months.
His interest in objects and activity around him was always encouraged. Nursery
rhymes were often sung to him and this obviously had relevance to his language
learning as he began to recite nursery rhymes at the early age of 1;8 (one
year, eight months). He had an excellent memory from an early age and a very
well-developed sense of humor. He showed a great interest in the household pets,
i.e., Bob / Bobby the dog, the two cats referred to as 'kitties' and the two
goats, Rooney and Anne. His vocabulary can be seen to be that of names of
humans, animals, and objects in his close environment, of names of objects in
picture-books, and of actions concerned with the daily life and activities of the
family.
A child is constantly exposed to something of the phonological system of
the adult. To begin with, the sort of language that is used to a baby is restricted
mainly to that associated with a baby's routine. The routine is carried out several
times in twenty-four hours and this means that the baby hears frequent repeti-
tions of the same sort of vocabulary and therefore many of the same combina-
tions of sound features, and of course the same sort of sentence structures in the
same sort of contexts for several months: as Fry pointed out (1966: 188), a child
needs to hear speech in context in order for it to be meaningful. Later when he
attempts to speak, he gets encouragement, correction and reinforcement from
those around him and this is usually in the form of a repetition of the whole
utterance, i.e., nonsegmental.
Deviances from the norm in the speech of adults, whether phonological or
grammatical, generally do not occur frequently enough in the same form for a
child to pick them up and remember them. One may expect that he would be
far more likely to pick up the forms that keep coming up regularly and occur in
the same sort of situation, and it therefore seems reasonable to consider that
most forms used by a child are his own creations made on the basis of regular,
nondeviant adult forms. From observation of the child P, it seems probable that
when beginning to operate language, as opposed to simple repetition or imitation,
a child perceives only certain of the features of the adult utterances and
reproduces only those that he is able to cope with. (A fuller discussion of this
point may be found in Waterson 1970.) This results in his forms sometimes
appearing to be quite unlike those of the adult. As his input increases and his
experience grows, so he is able to perceive and reproduce more, and his
phonological system develops closer and closer to the adult system.
It was found that some of P's early forms seemed so different from the
corresponding adult forms as to appear to have no relationship to them at all, but
they are known to be the same by their function in context (see p. 64). Examined
segmentally, such child forms show very little congruence with the adult forms,
e.g., comparing some of P's words with initial [] and the corresponding adult
forms which have been established as having the same meaning by their function
in context, no meaningful correlation can be shown. The adult forms with which
the child forms are compared throughout the chapter are those of his mother unless
stated otherwise.
            Child forms                                    Adult forms
finger      [:/i:] (i.e., two forms were in use)           [f g]
window      [e:e:]                                         [w nd u]
another     [aa]                                           [n]
Randall¹    [a ]                                           [r nd ]
¹ 'Randall' is the name of a friend and neighbor who helped to look after P
and who was always addressed and referred to as 'Mrs. Randall' until P started
to call her [a ]; after this she sometimes referred to herself as [nn] but not
before, so that P's use of [a ] arose from the form [r nd ] and not [nn].
In the comparison, the adult's initial [f, w, , r ,] appear to correspond to the
child's initial [], and the adult's medial [] or [g], [n] or [nd], and [n] or [] appear
to correspond to the child's medial []. Analyzed in terms of features, as in the
pages that follow, it is possible to show a much clearer relationship between
these forms.
For the purpose of this chapter a selection had to be made from P's vocabulary,
and examples were chosen mainly from the period 1;6 because at that
time he was regularly using many forms of words which seemed quite unlike
the corresponding adult forms. At this time, his vocabulary consisted of
approximately 155 words; about 104 monosyllabic, 48 disyllabic and 3 or so
trisyllabic. Of the 48 disyllabic forms, 36 were of the CVCV structure. The child
was already using three- to four-word sentences, e.g., [dada ti::gn] 'daddy's tea's
all gone'; [ba:gn ga:tn] 'Bob's gone in the garden'; [m ge:k dada] 'more cake,
daddy'. It is not possible to give exact figures for the number of words either
monosyllabic or disyllabic, nor the length of sentences in number of words
without entering into discussions of whether the child's forms such as 'do it', 'sit
down', 'all gone', etc., are one word or two.
The aim was to see what correlations could be established at the phonetic and phonological levels and to try to determine how much of the adult form the child was reproducing. The child's words selected for study were therefore chosen for their phonetic form; the only relevance of their meaning is that through semantic correlation child and adult forms can be identified as being the same word. The semantic correlation was established by the function of the word in context, i.e., the words were regularly used by the child in the same contexts and with reference to the same objects or actions as the adult word so used, and the adult's reaction in response to the child's usage produced satisfaction for the child, i.e., it produced the desired result. The child made it quite plain in various ways if he was not understood in the way he intended. As a simple example one may take one of the child's forms for fly, viz. []. The child's father sometimes amused the child by pretending to hunt flies that came into the kitchen by pointing at them and trying to chase them out of the window, saying, 'Fly! fly! Daddy go bang! bang!'. Whenever P noticed a fly, he would say [, ] and point at it, or say [, , dada gu b b], thus clearly using [] in the way fly was used by the adults. P's forms were identified with their adult correlates in the way shown above and their semantic correlation is to be taken as established; only the nonsemantic correlations are to be established
here. The words selected for study are the child's forms for finger [:/i:]; window [e:e:]; another [aa]; Randall [a ]; fish [/]; fetch []; vest []; dish [d]; brush [by], all in use at 1;6, and forms for fly at 1;5 [w/b] and at 1;6 [v//b]; barrow at 1;5 [ww] and at 1;6 [bw]; at 1;6 flower [v/vw] and forms for Rooney [/h]; at 1;5 hymn/angel [ah /h/a ] (see p. 73); and at 1;8 honey [ah u:]. Also the following at 1;6: biscuit [be:be:]; bucket [bbu:]; pudding [pp]; Bobby [bbu:]; Kitty [tt]; and dirty [d :t]. A detailed phonetic account is given of these forms but only a brief
account of the phonology. A detailed phonological analysis of these forms is the subject of a separate paper in which regular correspondences are shown between child and adult structures and these correspondences are used to predict some child forms and structures from the adult's (see Waterson 1970).
It was found that correlations at the phonetic level could be stated between the child's and adult's forms by reference to the following:
1. Various features of articulation such as nasality, sibilance, glottality, stop (complete closure), continuance, frontness, backness, voicing, voicelessness, labiality, rounding, nonrounding. (A distinction is made between labiality and rounding. Labiality is used to refer to the action of the lips as being concerned in the articulation of a consonant, such as the lip protrusion which is part of the articulation of initial [r], or the lip contact of the labiodental stricture of [f]. Rounding, which includes labiality, refers to lip action which extends over the syllable, e.g., in the second syllable of barrow, [bru].)
2. Grade of vowel opening.
3. The syllabic structure of the words.
4. The prominence of syllables (cf. Jones 1962: 55, who used the term 'prominence' in relation to sounds: 'the prominence of sounds may be due to inherent sonority . . ., to length or to stress or to special intonation, or to combinations of these').
In order to make a comparison between the child's and adult's forms, the child's forms are first examined and analyzed into features and are then grouped into different types of structure according to the selection of features which goes into their composition, viz.
I. Labial Structure
II. Continuant Structure
III. Sibilant Structure
IV. Stop Structure, and
V. Nasal Structure.
It will be found that some features are common to all five types of structure, others to only three or two; but from the detailed descriptions given under each of the five headings, it will be seen that no one type of structure has the same selection of basic features as any other type. For what is meant by basic features see below.
Type I. Labial Structure
The child's forms of fly, barrow, and flower at 1;5 and 1;6 belong to this type.
fly barrow flower
1;5 [w/b] [ww]
2
1;6 [/v/b] [bw] v/vw
They have the following features in common: labiality at the onset of each syllable, [w, b, b, v, ]; continuance, [w, , b, v]; voiced onset of every syllable; voiced ending of every syllable; broad degree of openness of vowel (as opposed to closeness); prominence of one syllable; and the syllabic structure CV. These features account for the similarity of these forms and such features will be called the basic features. The structures may be symbolized as KV and KVKV at 1;5 and KV, KVKV, and PVKV at 1;6 (K = continuant system, P = stop system).
Features which are not shared by all the forms but may be shared by some and which account for the differences between them are as follows: friction, [b, , v]; nonfriction [w]; affrication [b]; bilabiality [w, , b, b]; labiodentality [v]; stop [b]; centrality of the syllable [b]; rounding of the syllable [w]; backness of the syllable [w]; the finer distinction of frontness of syllable as opposed to centrality, i.e., [w, b, , v] as opposed to [b] ([] in the speech of the child and his mother is fully front, [] is advanced from central but is not fully front); word structure CVCV and CV. Such features as account for differences of form will be called the differential features.
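The distinction just drawn can be read as a simple set operation: the basic features are those common to every form of a structure type, and the differential features are whatever remains. The following is a minimal Python sketch of that reading only. The function name `split_features` and the three toy forms are hypothetical illustrations, not P's transcribed data; the feature labels are merely borrowed from the lists above.

```python
def split_features(forms):
    """Split features into (basic, differential) for one structure type.

    `forms` maps a form label to the set of features observed in it.
    Basic features are shared by every form; differential features occur
    in at least one form but not in all of them.
    """
    feature_sets = list(forms.values())
    if not feature_sets:
        return set(), set()
    basic = set.intersection(*feature_sets)
    differential = set.union(*feature_sets) - basic
    return basic, differential


# Hypothetical toy forms (assumed for illustration, not the child's data).
toy_forms = {
    "form_1": {"labiality", "continuance", "voiced onset", "friction"},
    "form_2": {"labiality", "continuance", "voiced onset", "nonfriction", "rounding"},
    "form_3": {"labiality", "continuance", "voiced onset", "friction", "labiodentality"},
}

basic, differential = split_features(toy_forms)
print(sorted(basic))
print(sorted(differential))
```

On this reading a feature such as friction, which two of the three toy forms share, still counts as differential, because it is not common to all of them.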
It is seen that the form for barrow at 1;6 has developed to a form closer to the adult form and has the greatest number of differential features of all the child forms belonging to this type of structure. The child's form [ww] at 1;5 had more features in common with the other forms belonging to the Labial Structure than his form [bw] at 1;6.
Type II. Continuant Structure
The child's forms for Rooney, honey, and hymn/angel belong to this type of structure.
Rooney        honey          hymn/angel
1;6 [/h]      1;8 [ah u:]    1;5 [ah /h/a ]
The basic features of these forms are as follows: glottality, [h, ]; continuance [h, ] ([h] and [] are analyzed as glottal continuants, not as fricatives, because the stricture is at the vocal cords and there is no stricture in the supraglottal area in common with other sounds classed as continuants; sounds classed as fricatives have supraglottal stricture); prominence of the first syllable; voiced onset of syllable 1 ([h] was a rare form and is grouped together with the other forms under this type of structure because of its obvious similarity; it is the only one with voiceless onset); voiced ending of syllables 1 and 2; the disyllabic structure of the word. The structure may be symbolized as VHV (H = glottal continuant prosody, see p. 72).
The differential features of these forms are: voiceless onset in syllable 1 [h] and in syllable 2 [ah u:, ah , h]; voiced onset of syllable 2 [, h, a ]; frontness in [, h] and syllable 1 of [h]; backness in syllable 2 of [ah u:, ah , a ]; centrality in syllable 1 of [ah u:, ah , a ]; nasality in [, h] and in syllable 2 of [a ] (only the strongest nasality was recorded when transcribing the child's forms); the degrees of vowel openness, i.e., same grade of vowel, mid, in both syllables, [, h]; more open vowel in syllable 1 followed by more close in syllable 2 [ah u:, ah , h, a ]; and the structure HVHV.
If the basic features of this type of structure are compared with the basic features of the Labial Structure, it will be seen that the feature continuance is the only basic feature shared by them and that some features that are basic in the Labial Structure, e.g., labiality and nonrounding, are not basic but are differential in the Continuant Structure. Thus features that are basic in one type of structure are not necessarily basic in another type, nor are the differential features identical in different types of structure. What should be noted is that the selection of basic features in the two types of structure is different and this accounts for the difference in the basic shape and structure of the child's words belonging to the two different types of structure. For what determines which particular features will be basic and thus which words will be grouped into the same type of structure see pp. 78-83.
Type III. Sibilant Structure
The child's forms for fish, fetch, vest, brush, and dish belong to this type.
       fish    fetch    vest    brush    dish
1;6    [/]     []       []      [by]     [d]
The basic features of these forms are as follows: broad degree of closeness (as opposed to openness) of vowel [, , y]; voiced onset of word [, , b, d]; syllable ending with the following features: voicelessness, sibilance, continuance, frontness, labiality, palato-alveolarity; monosyllabic word structure.
The differential features are: rounding of the word [, by]; nonrounding of the word [, d]; backness of onset []; frontness of onset [, by, d] ([b] and [d] followed by front vowels have a more front quality than when followed by back vowels); the finer distinction of close-mid vowel grade, i.e., [, ], as opposed to close, i.e., [y]; bilabiality [b]; alveolarity [d]; word structures VC and CVC. They may be symbolized as VS and PVS (S = sibilant system, P = stop system).
The child has homonyms for fish and fetch and for fish and vest. This suggests that the corresponding adult forms have many features in common.
It may be noted that the basic features of this type of structure are different
from those of the two types of structure already described.
Type IV. Stop Structure
The child's forms for biscuit, Bobby, pudding, bucket, Kitty, and dirty belong to this type of structure.
       biscuit     Bobby     pudding    bucket    Kitty    dirty
1;6    [be:be:]    [bbu:]    [pp]       [bbu:]    [tt]     [d :t]
The basic features are as follows: oral stop at syllable onset; voiced ending of syllables; prominence of syllable 1; disyllabic word structure.
The differential features are: syllable features of frontness [be:, b, t]; backness [p, bu:]; centrality [d :]; rounding [p, bu:, d :]; nonrounding [b, be:, t]; voiced onset [be:, b, bu:, d :]; voiceless onset [p, t]; labial onset [b, p, d ]; nonlabial onset [t]; the degrees of vowel openness, i.e., same (mid) grade of vowel in both syllables [pp, tt, be:be:]; more open vowel in syllable 1 and more close in syllable 2 [bbu:, d :t]; bilabiality [b, p]; alveolarity [d , t]. Some of the structures are fully reduplicated monosyllables, i.e., (CV)², e.g., [be:be:, pp, tt]; some are partially reduplicated, having only the consonants reduplicated, i.e., (C)², e.g., [bbu:]; and one is reduplicated only as far as two consonantal features are concerned, viz. stop and alveolarity, i.e., [d :t]. There is thus a relationship of types of onset of the two syllables within the word, i.e., both syllables have onset with voice or with voicelessness, and with labiality or with nonlabiality. Only one form does not conform to this pattern, viz. [d :t]. The word dirty was frequently used to the child from an early age, e.g., 'dirty mouth', 'dirty hand', said when washing the child after meals. The child learnt it early, at 1;3, and used it long before [ :] was established in his disyllabic structures and before the combination of voiceless and voicing features and labiality and nonlabiality together with stop features within the same disyllabic form was in use in his system. Final [] was used little in disyllabic forms at this time apart from the form [tt] which is discussed later (p. 76). The form [d :t] thus has the character of a loanword. Cf. Velten (1943: 284): 'words which introduce a new sound have at first the character of loan-words'.
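The reduplication types described for the Stop Structure can be sketched as a small classifier over (onset, vowel) pairs. This is only a rough illustration under stated assumptions: the function name `classify_reduplication` is hypothetical, the pair representation is a simplification, and the third case in the text (reduplication of only some consonantal features, such as stop and alveolarity) is not modelled and falls under "none" here. The first example uses the form [be:be:] given above for biscuit; the second uses placeholder segments, since the transcription of the partially reduplicated forms is not fully legible in this copy.

```python
def classify_reduplication(syllables):
    """Classify a disyllabic form given as [(onset, vowel), (onset, vowel)].

    Returns "full" when the whole syllable is repeated, (CV)²,
    "partial" when only the onset consonant is repeated, (C)²,
    and "none" otherwise.
    """
    if len(syllables) != 2:
        raise ValueError("expected exactly two (onset, vowel) syllables")
    (c1, v1), (c2, v2) = syllables
    if c1 == c2 and v1 == v2:
        return "full"
    if c1 == c2:
        return "partial"
    return "none"


biscuit = [("b", "e:"), ("b", "e:")]      # [be:be:], as given in the text
toy_partial = [("b", "a"), ("b", "u:")]   # placeholder segments (assumed)

print(classify_reduplication(biscuit))
print(classify_reduplication(toy_partial))
```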
The child has homonyms for Bobby and bucket and the features of his forms are identical. This leads one to suspect that the adult forms of Bobby and bucket have many features in common with each other, perhaps more than with the rest of the adult forms which correspond to the child's forms belonging to this type of structure.
Type V. Nasal Structure
The child's forms of finger, window, another, and Randall belong to this type of structure.
       finger    window    another    Randall
1;6    [:/i:]    [e:e:]    [aa]       [a ]
They have the following basic features: nasality []; stop []; voiced onset of the syllable; voiced ending of the syllable; prominence of the first syllable; syllabic structure of word CVCV.
Differential features are: frontness of syllable [e:, e, i:, , ]; centrality of syllable [a]; length of syllable [e:, i:]; rounding of syllable [ ]; nonrounding of syllable [e:, e, i:, , a]; same grade of vowel in both syllables [e:e:, :, aa]; more open grade vowel in syllable 1 and more close in syllable 2 [a ]; more close grade vowel in syllable 1 and more open in syllable 2 [i:]. Some of the structures are fully reduplicated monosyllables, i.e., (CV)², e.g., [e:e:, aa]; some are partially reduplicated, i.e., (C)², e.g., [:, i:, a ]. The structures may be symbolized as follows: fully reduplicated (NV) (NV), partially reduplicated NVNV.
Each of the five types of structure has a different selection of basic features, i.e., the basic features account for the major structural differences between one type of structure and another, and the differential features account for the finer distinctions among the words within the one type of structure. Moreover, a feature that is basic in one type of structure may be differential and not basic in another and vice versa.
The fact that the child's words can be grouped into different types of structure according to their basic features suggests that the corresponding adult forms must also share some features among themselves and that features composing the adult forms must bear some relation to those of the child's. The adult forms corresponding to the child's are therefore examined to see what features they are composed of and what features they have in common among themselves and with the child's forms. In fact it will be seen that the adult forms can be grouped under the same five headings.
I. Labial Structure
All the adult forms corresponding to the child's belonging to the above structure, viz. [a] fly, [bru] barrow, [a:/aw] flower, share the following features: labiality [b, f, r, w]; the liquid feature, i.e., partially interrupted vowel-like sound, [r, l]; continuance [, r, w]; openness of vowel [, a, a:, a]; broad degree of frontness (as opposed to backness) of the first or only syllable [a, b, a:, a]; centrality of one or more syllables [a, a:, a, w, ru]; nonrounding and prominence of the first or only syllable; voiced ending of all syllables; syllabic structure CV. For ease of comparison [] is analyzed as one complex unit, labiodental fricative with lateral release. The above are therefore the basic features of the adult forms belonging to the Labial Structure.
Features not shared by all the forms, i.e., the differential features, are as follows: bilabiality [b, w]; labiodentality [f]; friction [f]; lateral release []; alveolarity [r]; stop [b]; voiceless onset of syllable []; voiced onset of syllable [b, r, w]; the finer distinctions of syllable ending, i.e., front ending in [a] and [b]; back ending in [ru]; close ending in [a, ru]; length of vowel [a, a:, ru], as opposed to shortness of vowel [b, a, w]; the relationship of more open vowel in syllable 1 to more close vowel in syllable 2 in [aw] and [bru]; word structures CV and CVCV.
Each of the adult forms belonging to this type of structure has ten basic features and fewer differential features. The differential features may be shared with some forms or with none, e.g., [a] has front ending of the syllable shared with [b]; close ending shared with [r u]; lateral release shared with [a:] and [aw]; length of vowel shared with [a:] and the second syllable of [br u]; labiodentality shared with [a:] and [aw]; friction shared with [a:] and [aw]; and word structure CV shared with [a:], i.e., it has eight differential features compared with ten basic. [a:] has length of vowel shared with [a] and syllable 2 of [br u] and the following features shared with [a] and [aw]: friction, labiodentality, lateral release, and voiceless onset. It shares word structure CV with [a]. It thus has six differential features. [aw] has labiodentality, lateral release, voiceless onset, and friction, all shared with [a] and [a:], and bilabiality, the relation of more open vowel in syllable 1 and more close in syllable 2, voiced onset of syllable 2, and word structure CVCV shared with [br u], i.e., eight differential features. [br u] has the differential features stop, rounding of syllable, alveolarity, and frontness (i.e., the finer distinction of front as opposed to advanced from central); these are not shared with any of the other words belonging to this type of structure and are four in number. [br u] also has some shared differential features: bilabiality, voice at syllable onset, the relation of more open vowel in syllable 1 and more close in syllable 2, and the word structure CVCV, these being shared with [aw], and length of vowel, i.e., in [ru], which is shared with [a] and [a:], i.e., a further five differential features, making a total of nine differential features in all. [bru] has the greatest number of differential features that are not shared and it seems that the differences therefore stand out for the child as he makes a fairly quick adjustment to bring his form closer to the adult's, i.e., from [ww] at 1;5 to [bw] at 1;6.
As was expected, the adult forms corresponding to the child's share a large number of features among themselves, viz. ten. Furthermore, if the child's basic features and the adult's basic features are compared, it can be seen that a number of them are shared. These are: labiality at syllable onset, continuance, voiced syllable ending, broad degree of openness of vowel, and the following features of the first or only syllable: broad degree of frontness, nonrounding, prominence, and syllabic structure CV; i.e., eight features are shared by the child's and adult's forms. The child's forms do not have the adult's basic features of liquid and centrality, and the adult's forms do not have the child's basic feature of voiced onset of syllable.
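The comparison in the preceding paragraph can be checked with set arithmetic. The sketch below is an illustration only: the feature labels are shortened paraphrases of the text, the names `ADULT_BASIC` and `CHILD_BASIC` are hypothetical, and the child's set is taken to be the eight shared features plus voiced onset of syllable, as that paragraph states.

```python
# Adult basic features of the Labial Structure (ten, as listed in the text).
ADULT_BASIC = {
    "labiality", "liquid", "continuance", "openness of vowel",
    "frontness", "centrality", "nonrounding", "prominence",
    "voiced syllable ending", "syllabic structure CV",
}

# Child basic features as stated in the comparison paragraph (assumed
# reading): the eight shared features plus voiced onset of syllable.
CHILD_BASIC = {
    "labiality", "continuance", "openness of vowel", "frontness",
    "nonrounding", "prominence", "voiced syllable ending",
    "syllabic structure CV", "voiced onset of syllable",
}

shared = ADULT_BASIC & CHILD_BASIC
adult_only = ADULT_BASIC - CHILD_BASIC
child_only = CHILD_BASIC - ADULT_BASIC

print(len(shared), sorted(adult_only), sorted(child_only))
```

Under these assumptions the intersection has eight members, the adult-only features are liquid and centrality, and the child-only feature is voiced onset of syllable, which matches the counts given in the text.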
If the differential features of the child's and adult's forms belonging to the Labial Structure are now compared, it is seen that there is some similarity here also, e.g., friction, which is differential for child and adult, is common to the adult's [a] fly, [a:] and [aw] flower in []; it is also common to some of the child's forms for fly, viz. [, v, b], and to both forms of flower, viz. [v]. Labiodentality is common to the child's and adult's forms for flower, adult's [f], child's [v]; the relationship of more open vowel of syllable 1 to more close vowel of syllable 2 is common to the child's and adult's forms for barrow at 1;6;
bilabiality is common to the onset of the child's and adult's forms for barrow at 1;5 and 1;6, child's [w] and [b], adult's [b]; labiality at the onset of syllable 2 is common to both, child's [w], adult's [r ]; voiced onset of both syllables is also common to both. At 1;6 plosive onset of syllable 1 and rounding of syllable 2 are shared by the child's and adult's forms.
The fact that the child's and adult's forms share a large number of basic features seems to offer an explanation why all the child's forms for these words are composed of those particular features and can be grouped into one type of structure. It seems that the child perceives these particular features in the adult forms and reproduces them. He also reproduces some of the differential features which make the individual differences between the various adult forms belonging to the one type of structure, so that his forms too are different from each other except for [v], which is used both for fly [a] and flower [a:], which are very similar apart from the ending.
If one now examines how the features are combined in the child's and adult's forms, it is seen that they are not always combined in the same way, e.g., the onset of the adult's form for fly has the combination of features labiality, labiodentality, friction, continuance, lateral release, voicelessness, viz. []. The child's forms for fly have onset of several different combinations of features, all of which have voice, continuance, and labiality, and the additional features as follows: labiodentality and friction [v]; bilabiality and nonfriction [w]; bilabiality and friction []; bilabiality and affrication [b], i.e., the main differences are in the type of stricture and in the vibration or nonvibration of the vocal cords.
In barrow, the adult's [br u], child's [ww], the child's and adult's syllable 1 have onset with voice, labiality, and bilabiality but the adult has the stop feature, [b], and the child has frictionless continuance, [w]. The child's and adult's syllable 1 also share the features voice, frontness, and openness of vowel. Syllable 2 of the child's and adult's forms have onset with voice, labiality, and frictionless continuance but where the adult has alveolarity, [r ], the child has bilabiality, [w]. The child's form is a reduplicated monosyllable. From the whole of the adult form [br u], he reproduces the consonantal features of labiality, voice, and laxness, together with the open vowel grade and frontness of the prominent syllable, giving the form [w] which is reduplicated to give a disyllabic form. He does not reproduce the stop feature and it is possible that he does not perceive it here (see p. 77). At 1;6 the child has acquired the stop feature initially in the word in place of frictionless continuance, the relationship of more open vowel in syllable 1 to more close vowel in syllable 2, the feature centrality in syllable 1, and the features backness and rounding in syllable 2, i.e., [bw]. In this his form for barrow is moving away from the general pattern of his Labial Structure words which had no stop feature and no backness and rounding, i.e., his form for barrow has now acquired features which were not in the composition of his Labial Structure words before, so that now his Labial Structure has expanded to accommodate a wider range of forms. Thus at 1;6 the child seems to perceive and reproduce more features of the adult form of barrow than he could before and now presumably the framework within which he observes adult forms has been extended so that he is able to perceive more and reproduce more than he was able to at 1;5. This then may be taken as an example of how the child's structures expand and change, thus changing the whole phonological system and bringing it closer to the adult system.
II. Continuant Structure
The adult forms corresponding to the child's forms belonging to this structure, i.e., [r :n] Rooney, [hn] honey, [hm] hymn, [en ] angel, all share the following features: continuance which is combined either with labiality or with glottality (with labiality in [r ] of [r :n], [] and the fricative release of [ ] of [e n ], and glottality in [h] of [hn] and [hm]); nasality in the stops [n] and [m] and a certain amount in the vowels. Where the word has voiced onset there is fairly strong nasality over the word, e.g., [r :n ] and [e n ]; where the onset is voiceless, the nasality is weak, e.g., [hn] and [hm]. Also common to all the forms are voiced syllable ending, frontness of a syllable, vowel with broad degree of closeness, and prominence of the first or only syllable. The above features are therefore basic for the adult forms belonging to this structure. Differential features are as follows: glottality, labiality, alveolarity, bilabiality, affrication, sibilance, the liquid feature, long vowel [u:, e], broad degree of openness of vowel (as opposed to closeness) [, ]; the relation of vowel of syllable 1 to syllable 2 as more open to more close, and more close to more open, and the following syllable features: centrality, backness, voiced onset, voiceless onset, syllabic structures CV, CVC, VC, and word structures monosyllabic and disyllabic.
It is seen that the adult forms share several basic features. Some of these basic features are also common to the child's forms, viz. continuance, voiced syllable ending, prominence of the first syllable. The child's forms also have several differential features in common with the adult's, e.g., glottality, labiality, more open vowel in syllable 1 followed by more close vowel in syllable 2, the syllable features of rounding, nonrounding, backness, centrality, nasality, and CV syllabic structure and disyllabic word structure.
In the case of the adult forms belonging to this type of structure, the articulation of the nasal stops is weak as they are found in weakly stressed positions in the word, i.e., at the onset of unstressed syllables in [r:n] and [hn] and in syllable-final position in [hm] and [e n ]. Also, in the latter case, the nasal stop is followed by sibilance and this is a context in which the nasal stop is very weakly articulated in the speech of the child's mother. The child does not reproduce the nasal stops, so it is possible that he does not perceive them clearly in these contexts, but he does reproduce the syllable feature of nasality, cf. adult's [r:n] and child's [ ]. In fact there are no strongly articulated consonantal features in the adult's forms belonging to this type of structure nor are there any in the child's. It is possible therefore that the child does not perceive the consonantal articulations clearly enough to attempt to reproduce them at this stage, but he is
presumably aware of the disyllabicity (as the majority of his forms have the same
number of syllables as their adult correlates), and reproduces it with separation of
the syllables by glottal continuance. The glottal continuants are thus not part of
his consonantal system but act as a link between the two syllables.
As far as the syllabic structure of the words is concerned, there is a difference: the child's forms all have the structure VHV (apart from the rare form [h]), whereas the adult forms have the structures CVC, CVCV, and VCCVC.
P's forms [ah /h/a ] were used with reference to angel and hymn. He had a hymn-book with angels on the cover so that the words angel and hymn were both often used in connection with it. P used the form [ahm] once for hymn-book on June 22, 1960 and [ah ] and [h] for angel. On June 26, 1960, he used [a ] for angel, pointing at the angels on the cover one at a time and naming them. On the same day he used [b a ] for hymn-book, i.e., hymn/angel-book. It seems that the words hymn and angel were not clearly differentiated for him and so the forms were confused and features common to both the adult forms were therefore used in his forms. The disyllabic form of the child's words probably shows their relationship to the disyllabic adult form [e n ] as the majority of monosyllabic adult words had monosyllabic correlates in the child's forms. Adult's [hm], with initial glottal continuant and final labial nasal stop, and [e n ], with nasality and stop and sibilant continuance and with labiality in the second syllable, are reproduced by the child with the following forms: [ah /h/a ], i.e., all having a medial glottal continuant, and with the labiality feature in the form of rounding of the second syllable in two cases as well. The adult form [hni ], with initial glottal continuant and no labiality, is reproduced by the child as [ah u:], with medial glottal continuant and with labiality in the form of rounding of the second syllable. There is some correlation of the features nasality and nonnasality. There is nasality in the child's forms of Rooney [/h] in common with the adult form [r u :n ], and in one of the child's forms of hymn/angel, viz. [a ], in common with the adult form [e n ], i.e., where the adult has voiced onset and heavy nasality, the child has nasality in the word but where the adult has voiceless onset and weak or no nasality over the word, the child has no nasality, e.g., adult [hn], child's [ah u:]. It has already been shown that the words hymn and angel are not clearly differentiated semantically for the child and it seems that they are therefore not phonetically differentiated. It is possible to link the two forms without nasality, [ah ] and [h], more closely with the adult form [him], which has voiceless onset and little nasality over the word, and the form with nasality, [a ], more closely with the adult form [e n ], with voiced onset and stronger nasality. A correlation of vowel grade can be stated. In the adult forms [hn] and [e n ], the vowel of the first syllable is more open than the vowel of the second and this difference of vowel grade is maintained by the child, i.e., in [ah u:] and [ah /h/a ], the first vowel is more open than the second. In the adult form [r :n ] the vowels are close, [u:], and close-mid, [i], and in the child's forms they are in the mid range, [e].
In words belonging to this type of structure it is seen that features common to the child's and adult's forms are not always in the same combinations, nor are they always in the same sequence.
III. Sibilant Structure
The adult forms corresponding to the child's belonging to this type of structure, viz. [fi] fish, [fet] fetch, [vest] vest, [br] brush, and [di] dish, all share the following basic features: broad degree of frontness of vowel; nonrounding; labiality in [br, , f, v, t]; friction in [f, v, ], in the release of [t] and in the onset of [st]; continuance in [f, v, ] and in the release of [t] and [br] and in the onset of [st]; sibilance in [st, t, ]; and the syllabic structure CVC. [st] of vest is analyzed as one unit, a checked sibilant, and [br] of [br] is similarly treated as one unit, stop with liquid continuant release. The differential features are voiced onset [vest, br, di]; voiceless onset [fet, fi]; labiodentality; bilabiality; alveolarity; palato-alveolarity; stop and the liquid feature.
As expected, the adult forms have a large number of features in common and here again is a reason why the child's forms belong to one type of structure. The basic features shared by the child's and adult forms are labiality and voiceless ending together with sibilance and frontness, and monosyllabic word structure. Differential features that are shared are: frontness and nonrounding of the syllable, viz. child's [i] and adult's [fet], child's [] and adult's [fi]; mid vowel grade except in the case of brush, where the adult has open-mid [] and the child has close vowel [y]; labial onset; some of the child's forms which correspond to adult forms with labial onset and labial or nonlabial ending have rounding throughout, viz. the child's [] and adult's [f], child's [] and adult's [vest], and child's [by] and adult's [br]; the stop feature in the forms for brush and dish, and the syllabic structure CVC in the same two examples. The child has a simple unit where the adult has a complex unit, e.g., child's [], adult's [st]; child's [b], adult's [br].
Where the adult form has onset with the stop feature, the child's form also has onset with the stop feature; where the adult form has onset with labial continuance, the child form has vocalic onset which in some cases is labialized. It appears that when nonsibilant continuance (simple, not complex) occurs in the same syllable as sibilant continuance, the child reproduces only the more forcefully articulated sibilant continuance. This results in his forms having vocalic onset where the corresponding adult forms have nonsibilant continuance at the onset. As the initial stop features of the adult's brush and dish are reproduced by the child, one may conclude that they are easily perceived by the child in spite of competition from the sibilant continuance. Stops are already well established in the child's system but labiodental fricative continuants are not; cf. Labial Structure words where the child used a variety of labial continuants with different types of friction, or with no friction at all. In the adult forms of fish, fetch, and vest, the initial fricative continuants are simple and
comparatively weak articulations, viz. [f] and [v], and are not reproduced by
the child, so that his forms have vocalic onset; but in Labial Structure words
the initial fricative continuants of the adult forms are complex, viz. [], and are
more forcefully articulated and the child's corresponding forms have a consonantal
onset.
IV. Stop Structure
The adult forms corresponding to the child's forms belonging to this structure,
i.e., [bskt] biscuit, [bb] Bobby, [pd] pudding, [bkt] bucket, [kt] Kitty,
[d :t] dirty, have the following basic features: stop at syllable onset, syllable
with mid-vowel grade, front syllable, nonrounded syllable, and disyllabic
word structure. Several differential features are shared by the various forms,
e.g., bilabial onset of syllable 1 in [bb, pd, bkt]; nonbilabial onset of both
syllables in [kt] and [d :t]; voiceless onset of syllable: syllable 1 of [pd],
syllables 1 and 2 of [kt] and syllable 2 of [bkt] and [d :t]; voiced onset
of syllable 1 in all cases except [kt], and [pd]; mid grade vowel in both
syllables in [bskt, pd], and [kt]; more open vowel in syllable 1 and more
close in syllable 2 in [bb], [bkt], and [d -t]; rounding in syllable 1 of [bb],
[pd], and [d :t]; centrality of syllable 1 of [bkt] and [d :t]; syllabic
structure CV in five forms and CVC in three forms.
In the Stop Structure bilabiality and nonbilabiality at word onset seem to
have an important role for the child. Where the adult form has bilabial onset in
syllable 1, the child has bilabial onset of both syllables; where the adult form
has nonbilabial onset in syllable 1, the child has nonbilabial onset in both
syllables. This is because the child's forms are mostly reduplications of the
first syllable of the adult forms. Some are reduplications of the whole or of the
onset of syllable 1 of the adult form, i.e., full reduplication, where the vowel
grade of the adult form is the same in both syllables, e.g., child's [pp] from
adult's [pd], where the features of the child's reduplicated syllable are
identical with syllable 1 of the adult form, and [be:be:], which has the features
of syllable 1 of adult's [bskt] apart from the sibilant ending and the finer
distinction of vowel grade, i.e., the child's and adult forms have mid vowels but
the child's is open-mid and the adult's is close-mid. Some of the child's forms
are partial reduplications of the adult forms, e.g., the child's forms for bucket
and Bobby, viz. [bbu:], where the consonantal onset of the first syllable of
the adult form is reduplicated and the different vowel grades of adult syllable 1
and syllable 2 are maintained by the child. The child's and adult forms for
dirty are identical, so here the child's form is not a reduplication of part of the
adult form but it does have something of a reduplicative nature in that both
syllables have onset with alveolar stops. This may be the reason why the child
was able to imitate it successfully.
If the adult forms for which the child has homonyms, viz. [bkt] and [bb],
are compared, they can be shown to share a large number of features, e.g.,
disyllabic word structure; syllabic structure of syllable 1, viz. CV; bilabial and
voiced onset of syllable 1; stop at onset of syllables 1 and 2; voiced ending of
syllable 1; non-frontness of syllable 1; frontness of syllable 2; more open vowel
in syllable 1 and more close vowel in syllable 2; nonrounding of syllable 2. The
main differences are in the syllabic structure of syllable 2, i.e., CVC and CV, in
the rounding and nonrounding of syllable 1, and in the backness of syllable 1 of
[bb] as opposed to centrality of syllable 1 of [bkt]. The child's form [bbu:]
has all the features that are shared by the forms [bkt] and [bb] except that
the frontness and nonfrontness of the syllables are reversed, i.e., the child has
frontness of syllable 1 and nonfrontness of syllable 2. He also has nonrounding
of syllable 1 and rounding of syllable 2, which is the reverse of [bb]. In view
of the fact that [bkt] and [bb] have so many features in common, it seems
reasonable that the child should use the same form for them both. The analysis
shows that the child uses the same form for them because he perceives the same
features in them and not because of any similarity in the objects to which the
words refer or any lack of semantic differentiation.
P's form [tt] is now considered in relation to the adult form [kt]. From what
has been said earlier, it appears that he is able to perceive the difference between
onsets with bilabial and nonbilabial stops but it is not clear whether he is yet
able to perceive the difference between velar stop and alveolar stop at syllable
onset. At 1;5 he already had [g] at the onset of monosyllabic and disyllabic
words but [k] only at the ending of monosyllabic words. He had a wider range
of combinations of features in monosyllabic words than in disyllabic words.
This suggests that perception and reproduction are easier in shorter stretches
than long. It is likely that this is linked with syllable prominence, i.e., that the
child perceived prominent syllables more easily than the nonprominent. It is
probable that of the consonantal features of the disyllabic adult form [kt] it is
the features of stop and voicelessness at the onset of both syllables that the child
perceives most clearly, i.e., features that are reinforced by virtue of occurring in
two places in the word. As he has the combination of the features voicelessness
and stop only either with bilabiality or with nonbilabiality (alveolarity) in
disyllabic forms, he has to make a choice between bilabiality and alveolarity.
The fact that the second stop of the adult form is combined with alveolarity, a
combination already familiar to the child, no doubt helps him to perceive the
nonbilabial nature of the stops of the adult form and he therefore produces the
combination of features without bilabiality, viz. voicelessness, stop, and alveo-
larity for the consonantal element, together with the vowel grade and syllable
features of the adult form, this resulting in the reduplicated form [tt].
The second stop in the child's reduplicated forms bears some relation to the
second stop of the adult forms because the child only has reduplicated stop
forms as a reflex of adult disyllabic forms with a stop at the beginning of each
syllable, cf. barrow, which in the adult form has a voiced bilabial stop at the
onset of syllable 1 and a labial continuant at the onset of syllable 2, and in the
child's form at 1;5 does not have stops but labial continuants at the onset of
both syllables, [ww], i.e., it seems as if the stop feature has to occur in two
places in the disyllabic adult forms, that is to say it has to be reinforced, for the
child to reproduce it and reduplicate it.
V. Nasal Structure
The adult forms corresponding to the child's forms belonging to this structure,
i.e., [f g] finger, [w nd u] window, [r nd ] Randall, and [n] another,
have the following basic features: continuance [f, w, r , , ]; nasality combined
with the stop feature in [n, ] and in varying degrees over the word; nonrounded
syllable; voiced ending of all syllables; voiced onset of syllable 2; prominence
of penultimate syllable.
There are very many differential features and only those of special interest
are listed here to save repetition; they are as follows: nasal homorganic with the
following oral stop [g, nd]; labiality [f, w, d , r ]; more close vowel in syllable 1
followed by more open vowel in syllable 2 in [f g] and [w nd u]; more open
vowel in prominent syllable and more close in the following syllable in [r nd ]
and [n], i.e., [] followed by a lateralized mid labiovelar quality [] in
[r nd ], and [] followed by [] in [n]; nonrounding is shared by [] and
[n]; rounding in the second syllable is shared by [wi nd u] and [r nd ].
It can be seen that the basic features nasality and stop are common to all the
child's and adult's forms. Prominence of the penultimate syllable is also basic to
child and adult as are the following: nonrounded syllable, voiced ending of all
syllables, voiced onset of syllable 2. The adult's basic feature continuance is not
reproduced by the child.
The nasals of the adult forms, apart from [n], are homorganic with the
following oral stops and are thus complex articulations and strongly articulated.
In [n] the nasal stop is at the onset of a stressed syllable and is also strongly
articulated. These strongly articulated nasal stops are reproduced and redu-
plicated by the child; cf. the weakly articulated nasal stops of Continuant
Structure words which are not reproduced by the child. In Nasal Structure
words the nasal stops are more forcefully articulated than the continuants and
it may be that they are therefore more clearly perceived by the child and hence
are reproduced by him.
As the differential features are many, a more detailed comparison is needed
to show the close relationship of the child's and adult's forms. Prominence in
the first syllable of the adult forms for finger and window, which have strong
nasality in addition to the other qualities which go to make a syllable prominent
(see p. 65, reference to Jones 1962), is matched in the child's forms by length of
syllable, i.e., the first syllables of [e :e /i:] finger and [e:e:] window. The
second syllable of window in the child's and adult's forms, although less
prominent than the first syllable, has more prominence than the final unstressed
syllable of their forms for finger, i.e., [e] and [] in the child's forms for finger
are less prominent than [e:] in syllable 2 of his form for window, and [g] in the
adult's form for finger is less prominent than [d u] of window.
A correlation of vowel grade can be shown. Four grades of openness of vowel
are needed to describe vowels of the child's and adult forms being discussed
here: close [i:], close-mid [1], open-mid [, e, o], and open [, a, ], [u] is a
labial glide.
The child's forms belonging to this structure may be described as reduplicated
structures. Some are fully reduplicated, i.e., [aa] and [e:e:], and
others are partially reduplicated, i.e., [e :e ], [i: ], and [a o]. In the case
of [aa] it is the prominent syllable of the adult form that is reduplicated; in
the case of the rest, the strongly articulated nasal plus stop of the adult form is
reproduced by the child as a simple nasal stop and this is reduplicated to provide
the consonantal elements for the disyllabic forms. The vowel grades of the first
and second syllables of the adult forms are maintained by the child in a broad
degree, as was shown above, but the syllable features are only partially maintained,
e.g., in finger the child has frontness in syllables 1 and 2 where the adult
has frontness in syllable 1 and centrality in syllable 2, but both have nonrounding
in both syllables; in window the child has frontness and nonrounding
in both syllables but the adult has frontness and nonrounding in syllable 1 and
centrality and rounding in syllable 2.
Adult forms and the corresponding child forms:

Adult form [fi g]: 1st vowel close-mid, 2nd open-mid, i.e., both broadly mid
and 2nd more open than 1st.
Child form [e : e ]: Both vowels mid.
Child form [i: ]: 1st vowel close, 2nd close-mid, i.e., both broadly mid and 2nd
more open than 1st.

Adult form [wi nd u]: 1st vowel close-mid, 2nd open-mid, i.e., both broadly mid.
Child form [e: e:]: Both vowels mid.

Adult form [n]: Penultimate vowel open, final vowel open-mid, i.e., both broadly
open.
Child form [aa]: Both vowels open.

Adult form [r nd ]: First vowel open, 2nd syllabic a labiovelar lateralized
open-mid quality, 2nd syllable has labial quality.
Child form [a ]: 1st vowel open, 2nd open-mid, 2nd syllable rounded, i.e., has
labial quality.

The relationship between the child's forms selected for study and the corresponding
adult forms has now been demonstrated phonetically (by the shared
basic and differential features) and phonologically (by the child's and
adult forms being assigned to the same types of structure). In establishing this
relationship it was possible to observe which features of the adult forms the
child reproduces and to draw some tentative conclusions about what the child
is best able to perceive when learning to operate language. It has been found that
the child's linguistic perception at this early stage appears to be more limited
than his perception in imitation and repetition and he is best able to perceive
the generally broader distinctions and the most forceful articulations. He
appears to perceive an utterance as a whole unit and perceives certain features
of the utterance but seems not to be always aware of the combinations and
sequence in which these features occur, cf. Ladefoged (1967: 149):
Listening to speech often requires the identification of differences in order which are
smaller than a syllable. Normal adults have no difficulty in hearing the difference
between waist and waits or fits and fist. But children and foreigners often make
mistakes of this kind; . . . If we use unfamiliar sounds it is easy to show that listeners can
differentiate between complex stimuli which differ in the order of their components, but
may not be aware of the differences in order. They differentiate between the stimuli as
wholes, and have to learn to interpret as order those cues which the ear transmits about
the relative times of arrival of the different parts.
To summarize, one may say that it seems that the child reproduces the features
of the adult form that he perceives most clearly and what he perceives most
clearly is (1) features that are already established in his repertoire and (2) the
most strongly articulated features and features that are reinforced in the utter-
ance, i.e., those that occur in more than one place in the utterance, and also the
broad distinctions rather than the fine.
The features that the child acquires the earliest in his phonological system
are presumably those that he perceives most clearly when he is listening linguis-
tically. Stop consonants which many children have been observed to acquire
earliest of the consonants can be assumed to be among the most clearly perceived
as they are the complete cutting off of the airstream, an extreme articulation in
comparison with the clear passage which is obtained with an open vowel, which
is observed to be the vowel commonly acquired earliest. This links up with the
views of Jakobson and others on the sequence in which sounds are acquired, but
it is not just a simple matter of acquiring sounds in a certain sequence because
if this were so, once a child had acquired a particular sound, he would use it in all
the places in which the adult used it; but this is not the case and hence we have
the problems of child language referred to earlier (p. 61), e.g., why a particular
sound is used in some words but not in others, why a child substitutes a sound
for one he is already able to make, why he drops sounds that he is able to make.
The answer to these questions may be as suggested earlier, i.e., that out of the
selection of features of which the utterance is composed, the child perceives
some more clearly than others and therefore reproduces those and not the others.
The features he perceives most clearly and those that he is able to reproduce
thus form the basis for his phonological structures, and the differences between
the child's and adult forms can thus be explained in terms of the child's limited
perception of the adult forms and the operation of his own phonological system
which results from his limited perception and limited ability to produce certain
features and combinations of features.
The hypothesis of the child's perception of the more strongly articulated
features and the broader distinctions generally as suggested by the analysis
presented in this chapter is in line with what was proposed by Leopold (1961:
352) when dealing with phonemic contrasts:
It is safe to assume that the small child's perceptive faculties develop gradually. When
the child's attention turns to language, it will first distinguish in what it hears only the
coarser contrasts, and will need time to appreciate the finer sub-contrasts between the
sounds which reach its ear. The same applies to the efforts to reproduce the sounds in its
own articulation.
Furthermore, it seems reasonable to suppose that when a child is acquiring his
first language, his perception of utterances is not conditioned to as great an
extent as that of the adult. The adult's perception is conditioned by the context
of situation and by his linguistic competence, i.e., by the grammatical proba-
bilities, e.g., morphology and syntax, by lexical probabilities, e.g., his lexicon
and collocations, and by the phonological probabilities, e.g., probabilities
of combinations of sounds in his system, the rhythmic shape of words, etc.
(cf. Ladefoged 1967: 144, and Gimson 1964: 34), i.e., the adult has a high
expectancy of what is to follow. For a child the relationship between the utter-
ance and the context is in the process of getting established through the function
of utterances in context. He has only the rudiments of linguistic competence or
none at all. He is therefore listening with very little expectancy, i.e., at some-
thing like a phonetic nonsense level, and thus very likely reproduces those
features that strike him most clearly and those that he is best able to produce at
the time. To illustrate this point one may take P's reproduction of the adult's
strongly articulated nasal stops and the nonreproduction of the adult's weak
continuants in his Nasal Structure words, and the nonreproduction of the adult's
weakly articulated continuants and nasal stops in his Continuant Structure words.
Also the reproduction of labial continuance in his Labial Structure words where
the adult forms have complex, strongly articulated continuants, but the non-
reproduction of labial continuance in his Nasal Structure words, which in the
adult forms have strongly articulated nasal stops but weak labial continuants.
The concepts of substitution, elision, and metathesis have been used a great
deal in order to try and explain differences between child and adult forms. Many
linguists agree that there is some system and regularity about these phenomena
but find it difficult to state the underlying rules for the regularity except in
terms of the debatable principle of the child using sounds involving the least
amount of effort and the principle of the use of the earliest acquired sounds
being substituted for those acquired later. Lewis (1968: 180–185) notes certain
limitations in the range of articulations within which substitutions can be made.
Leopold discusses the various theories of substitution very fully (Leopold 1947:
257–74) and tries to explain the irregularities he finds in terms of assimilations,
dissimilations, and metathesis which he considers upset the regularity of substitutions.
He is thus able to account for several of his child's forms but is still
left with some that he cannot explain. However, the irregularities arise from
the nature of the analysis, i.e., because an independent phonological system is
not set up for the child and all the child's forms are interpreted in terms of the
adult's phonological system. Another interpretation of some of the irregular
forms which shows them to be quite regular is given on pp. 84–6.
A few examples of the sort of problems usually dealt with by the concepts
referred to above are now taken from forms used by P in order to show how they
can more satisfactorily be explained by reference to the child's perception of
sounds and his phonological system. The forms have already been analyzed, so
now only the restriction on the use of certain sounds, viz. [v, w, , b], to specic
contexts is summarized and explained.
The sound [v]
It was seen that the child uses [v] in his Labial Structure words, e.g., initially
in one of his forms for fly, [v], and for flower, [vw]. He does not,
however, use [v] in his form for vest, i.e., he appears to substitute [v] for [f]
in his form for fly and flower but does not use it in vest where the adult does.
As he has no [f] in his system, it seems reasonable that he should have no
initial consonant for his forms of fish, [/], and fetch, [], but the question
that is usually asked is why when he is able to articulate a particular consonant,
e.g., [v], does he not use it wherever the adult does, e.g., in vest. The
reason for this is probably that as sibilant fricative continuance, here [st],
is more strongly articulated than nonsibilant fricative continuance, here [v],
he perceives the former more easily than the latter, and when both types of
continuant occur in the same syllable, as in vest, the child reproduces the type
that he perceives more clearly, i.e., the sibilant fricative continuant and not
the labial fricative continuant. This also accounts for sibilant continuance
being a basic feature for the structural type under which the words fish, fetch,
and vest are grouped.
In adult forms belonging to the Labial Structure there are only nonsibilant
continuants. These are complex articulations, e.g., [] in [a] and [a:], and
are fairly strongly articulated and there is no competition from any more
strongly articulated consonants, so the child reproduces the labial continu-
ance; but he has not as yet acquired the adult combination of the features
labiality, friction, and continuance in his system and therefore reproduces the
labial continuance with friction [v, ], or affrication [b], or without friction
[w], i.e., the sound [v] is not yet established in the child's system and therefore
he is not likely to perceive it clearly and reproduce it, especially when it is
in competition with sibilance, which is more strongly articulated and thus
more easily perceived. The adult forms with stop initial and sibilant fricative
continuant final, e.g., [br] and [d], are reproduced by the child with initial
stops, i.e., [by] and [d]. The stop feature was established early in the child's
system and the child appears to have had no difficulty in perceiving and
reproducing it in competition with sibilant continuance within the same
syllable.
In view of the above, the sound [v] cannot be expected in the child's form for
vest but can be expected in his forms for words such as fly and flower.
The sound [w]
P uses the sound [w] initially in some of his Labial Structure words, e.g., one of
his forms for fly, viz. [w], and barrow, viz. [ww], but he does not use [w] in
his form for window. As has already been shown, he appears to perceive the
most clearly and strongly articulated features and those that are already estab-
lished in his system from among the selection of features of which his model
is composed. In Labial Structure words, the adult combination of the features
labiality, friction with lateral release, and continuance is reproduced by the child
as labiality and continuance accompanied by friction [v, ], or affrication [b],
or nonfriction [w]; here there are no other more strongly articulated features in
competition with the labial continuance so the child appears to perceive the
labial continuance and reproduces it. [br u] has the stop feature which may be
considered to be more strongly articulated than the nonsibilant continuant [r ].
For an explanation of why [b] is not reproduced by the child, see p. 83. In the
adult form [wi nd u], the articulation of the complex [nd], i.e., homorganic
nasal and stop, is more forceful than the articulation of the noncomplex [w],
i.e., bilabial frictionless continuant. The child appears to perceive the stop
feature and the nasality more clearly than the weakly articulated continuant,
and reproduces the nasal stop (phonetically the homorganic nasal and stop may
be considered as one unit, a stop in which the soft palate is lowered at the onset
and raised before the release), and uses reduplicated nasal stops for the consonantal
elements of his disyllabic form. He does the same in finger and
Randall, where the initial noncomplex continuant articulation is less forceful
than the complex nasal and stop articulation, and thus the reduplicated nasal
stops are basic for a particular type of structure in his system, viz. the Nasal
Structure.
This explains why the child uses [w] in fly but not in window.
The sound []
The child uses [] in some words where the adult has [n], e.g., window and
another, but not in others, e.g., Rooney and honey. As has been shown earlier,
the nasal stops in [wi nd u] and [n] are strongly articulated and are therefore
reproduced by the child in reduplicated form to give a disyllabic structure, i.e.,
[e:e:] and [aa], but the more weakly articulated continuants [w] and [] are
not reproduced. In [r u :ni ] and [hn] the nasal stops are at the onset of unstressed
syllables and are thus weakly articulated. The continuants [r ] and [h] are also
weakly articulated so it seems that no consonantal features stand out clearly for
the child; but, as pointed out earlier, he does seem to perceive the difference
between monosyllables and disyllables, and perceives these forms as disyllabic
and reproduces them as such, the syllables being linked by glottal continuance
in the form of breath in honey, i.e., [ah u:], and breath and voice in Rooney, i.e.,
[]. The nasality of [r u :ni ] which is spread over the whole word in addition to
the nasal stop, is reproduced by the child as a feature of the whole word, i.e.,
[], but the very weak nasality of [hn] is not reproduced. This accounts for
the establishment of the child's Continuant Structure for words in which the
adult forms have no strongly articulated consonants.
The use and nonuse of nasal stops in the child's forms corresponding to adult
forms with nasal stops is thus explained by reference to the child's perception of
the features nasality and stop in relation to the rest of the features of which the
adult form is composed.
The sound [b]
There are several examples of the use of the sound [b] initially and medially in
P's speech. The use of medial [b] has been explained as resulting from the
reduplication of the first syllable of adult disyllabic forms which have more
than one stop consonant, e.g., child's [be:be:], adult's [bskt], child's [bbu:],
adult's [bkt], so that there is no case for saying that the child substitutes [b] for
adult [k] in these examples.
Although the child uses initial [b] where the adult form has initial [b] (cf.
examples of Stop Structure words), an example has been given where he does
not do so, viz. in his form for barrow [ww] at 1;5. In view of the arguments
that the child perceives and reproduces the most strongly articulated consonan-
tal features out of the selection of features of which the adult form is composed
and also the features that are already established in his system, it would seem
that the child should have reproduced the [b] of [br u], as a stop is more
strongly articulated than a frictionless continuant (here [r ]) but the child's form
is [ww], i.e., with continuants not stops. This can be explained as follows:
the child has oral stops in disyllabic words only when the adult form contains
two or more stops; cf. the Stop Structure words. It seems therefore that the
stop feature has to be reinforced, i.e., occur in more than one place, for the child
to reproduce it in a disyllabic form. There is only one stop in [br u], i.e., [b],
and the articulation of this stop is lax and is combined with the feature labiality.
The onset of the second syllable is also lax and labial and is combined with
continuance, i.e., [r ]. The features of laxness, voice, and labiality seem to be
reinforced in the word and thus are apparently more clearly perceived by the
child than the stop feature, which, although well established in the child's
system, is not reinforced and is therefore not reproduced. The labiality, voice,
and laxness together with continuance are reproduced as [w]. Thus it seems that
the child perceived the same features in barrow as in fly and flower. Hence his
form for barrow belongs to the same type of structure as his forms for fly and
flower, viz. the Labial Structure, and did not have the stop feature until his
system had developed further.
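The reinforcement condition just described can be illustrated with a small sketch. The Python below is only a toy reading of that condition and is not part of the original analysis: the function name, the plain ASCII consonant symbols, and the reduction of each adult form to a list of syllable-onset consonants are all assumptions made for the example.

```python
# Toy illustration only: hypothetical names and ASCII symbols,
# not the notation used in the chapter.
ORAL_STOPS = {"p", "b", "t", "d", "k", "g"}


def stop_is_reinforced(adult_onsets):
    """Return True if an oral stop begins at least two syllables of the adult form."""
    return sum(1 for onset in adult_onsets if onset in ORAL_STOPS) >= 2


# Adult disyllables reduced (as an assumption) to one onset consonant per syllable.
examples = {
    "bucket": ["b", "k"],  # stop at the onset of both syllables
    "kitty": ["k", "t"],   # stop at the onset of both syllables
    "barrow": ["b", "r"],  # one stop only, followed by a continuant
}

for word, onsets in examples.items():
    print(word, stop_is_reinforced(onsets))
```

On this reading the stop feature counts as reinforced in bucket and Kitty but not in barrow, which agrees with the observation that the child's form for barrow has continuants rather than stops.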
It is possible to give rules that govern reduplication in the speech of the child
P. These are related to the different types of structure. In Stop Structure words
there is forward reduplication of the prominent syllable of the adult form
when the initial consonant of the adult form already functions as an initial
consonant in the child's system, e.g., bilabial stops. Cf. full reduplication, e.g.,
child's [pp], adult's [pd]; child's [be:be:], adult's [biskit]; and partial
reduplication in child's [bbu:], adult's [bb] and [bkt]. (When reduplication
is described as being forward or reverse, the terms are used as convenient
labels to describe patterning in the child's phonological structure in relation to
the adult structure rather than any processes.) When the initial consonant of
the adult form does not function as an initial consonant in the child's system, the
child's form may be described as a reverse reduplication of the initial consonant
of the second syllable of the adult form, and of the grade of vowel and the
syllable features of the prominent syllable of the adult form, cf. child's [tt],
adult's [kt]. Similarly in the Labial Structure, stops do not function as initials
in the child's system at 1;5, so there is reverse reduplication giving child's
[ww] for adult's [br u]. In the Nasal Structure where the initial consonant
of the adult form does not function as an initial in the child's system, the child's
form is a reverse reduplication of the ending of the prominent syllable of the
adult form, e.g., the child has no [f] and therefore has [e:e/i:] for adult's
[fi g]. Where the initial consonant of the adult form functions as an initial in
the child's system belonging to a different structure, the child's form is also a
reverse reduplication (either full or partial), e.g., in the child's system initial [w]
and [r ] are used only in Labial Structure words (one of his forms for rabbit,
adult [r bt], was [r w]), therefore the consonantal elements of the child's
forms for window and Randall, which are Nasal Structure words, are a reverse
reduplication of the ending of the first syllable of the adult form, i.e., [e:e] and
[a ], with the vowel grades and syllable features of the adult forms partially
maintained.
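The choice between forward and reverse reduplication in Stop Structure words can be sketched as a simple conditional. The Python below is a minimal illustration under stated assumptions, not a formalization offered in the chapter: the function name is hypothetical, the consonants are written in plain ASCII, vowels and syllable features are ignored, and the set of consonants that function as initials in the child's disyllabic forms is assumed, for the example only, to contain just the bilabial stops.

```python
# Toy illustration only: hypothetical names, ASCII symbols, vowels ignored.
def reduplicated_onsets(adult_onsets, child_initials):
    """Return the pair of syllable onsets of the child's reduplicated form.

    adult_onsets: (onset of adult syllable 1, onset of adult syllable 2)
    child_initials: consonants assumed to function as initials for the child
    """
    first, second = adult_onsets
    if first in child_initials:
        chosen = first   # "forward": the onset of adult syllable 1 is repeated
    else:
        chosen = second  # "reverse": the onset of adult syllable 2 is repeated
    return (chosen, chosen)


# Assumption for the example: only bilabial stops function as initials.
child_initials = {"p", "b"}

print(reduplicated_onsets(("p", "d"), child_initials))  # pudding
print(reduplicated_onsets(("b", "k"), child_initials))  # bucket
print(reduplicated_onsets(("k", "t"), child_initials))  # Kitty
```

The sketch covers only the consonantal onsets of the reduplicated Stop Structure forms. It does not model the vowel grades, the Labial and Nasal Structure cases, or a form such as dirty, which the child reproduces without reduplication.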
Quite a large number of substitutions in the examples of reduplications in
children's speech given in the Appendices of Lewis's Infant Speech (1968) can
be explained in terms of reverse reduplication as described above. This sort of
reduplication has been observed before, but was interpreted in terms of assim-
ilation, e.g., by Leopold (1947).
By way of illustration of how the approach used in this chapter can be applied
to another child's speech in order to explain phenomena that cannot be
accounted for by substitution, assimilation, etc., a few examples are taken
from Leopold's material (1947: 257–74) which his theory could not explain,
to show how easily the irregularities can be explained by the type of analysis
suggested here. For instance, Leopold found it difficult to account for his
daughter's [de] for steht (stands), [d] for stone and [l] for story, all
these being used by his child at 1;11. His daughter Hildegard was bilingual
English and German. He suggests [de] is steht with metathesis of initial [] or
steh- plus English -z. He compared [d] with [de] and at first considered that it
had an incorrectly placed plural [z] (the word was used to refer to one stone), but
then decided it might be due to metathesis of initial [s]. However, from his excellent
phonetic records it is plain that at 1;10 and 1;11 his child had a type of structure
with sibilant final which had three different kinds of initial: (1) stop, (2) continuant,
and (3) nasal; cf. P's Sibilant Structure at 1;6. The following is a representative
selection of the forms the child had, taken from Leopold (1939: 53–137):
1. Stop initial and sibilant final [bi] piece; [be] bathe; [ba]
beiss(en) (bite); [da] crash, dress, Katz (cat), kratzen (scratch), Glas
(glass); [d] kiss; [du] juice, Kuss (kiss). Cf. the adult forms which in
both English and German have the basic features onset with oral stop, and
fricative (generally sibilant) in nal position. Child and adult forms share
the basic features of voiceless and fricative ending; voiced onset is basic
for the child but not for the adult; sibilant ending is basic for the child but not
for the adult although most of the adult forms do have sibilance.
2. Continuant initial and sibilant nal [ha] heiss (hot); [hau]
Hause, house; [wa], [wa] waschen, wash; [w] abwischen (to wipe up); [ju]
lutsch(t) (sucks). Cf. the adult forms which have the basic features continuant
onset and fricative ending. Sibilant ending is common to most adult forms but is
nonbasic; it is basic for the child. The onset is voiced for child and adult except
where there is glottal continuance, e.g., [ha], heiss; voiceless ending is basic for
both in the rst or only syllable.
3. Nasal initial and sibilant nal [mau] mouse; [na] nice, knife;
[na] nass (wet); [ma] much; [mau] mouth; [n] nose. Cf. the adult forms which
all have nasal stop initial (basic) and fricative (basic), generally sibilant ending.
Voiced onset is basic for both the childs and adult forms. Voiceless ending is basic
for the child but not for the adult although it is common to most of the adult forms.
One may note that Hildegard maintains the distinction of labial and nonlabial
onset in all the examples quoted under (1), (2), and (3). Cf. P's Stop Structure
forms where he kept this distinction.
Hildegard's system appears to be based on adult forms having sibilant
fricative final and initials with oral stop, continuant, and nasal stop. The
following irregular forms given by Leopold are now examined in relation to
the three types of structure set up for the child's phonological system: [de]
steht; [d] stone, and [l] story.
The adult form steht has the features (checked) sibilance and friction [t],
mid vowel [e], stop [t] and there is frontness over the whole word. The child's
form [de] has the features sibilance and friction [], mid vowel [e], stop [d], and
frontness over the whole word, i.e., it has features almost identical with those
of the adult form. In structures of the child's system with such a selection of
features, the stop feature comes first and the sibilant feature last; cf. (1) above,
and the onset is always voiced whether the onset in the adult form is voiced or
voiceless, i.e., voiced onset is basic for the child. The sequence of sounds as in
the child's form [de] is therefore the only possible one to fit her system and is
thus perfectly regular. If one examines the features composing the child's and
adult forms for stone, they are also found to be similar to each other. Adult
[stun] has the features (checked) sibilance and friction, [st], stop (nasal), [n],
more open vowel followed by more close, [u], and rounding of the whole
word. There is voiceless onset and voiced ending. The child's form [d] has
the features sibilance and friction [], stop (oral) [d] (nasality is not a basic
feature of the child's stop initial and sibilant final structures, so is not relevant
here; it is only relevant when onset with nasality plus stop is basic in the adult
form), more open vowel followed by more close, [], and rounding and voicing
at the onset of the word and nonrounding and voicelessness in the ending. As
noted above, the child's system requires the stop to be initial and the sibilance to
be final, with voiced onset and voiceless ending. The vowel grades are the same
as in the adult form, viz. more open followed by more close, but the rounding
feature does not extend over the whole of the child's form. The form [d] is
thus just as regular as the form [de] and the other forms listed under (1) above.
The form [l] for story puzzled Leopold so much that he doubted the
interpretation of the word as story, although the context in which it was used
seemed to suggest it: it was given as the answer to the question was hat Mama dir
erzählt? (what did mummy tell you?), the answer being [ l], which Leopold
interpreted as a story but said that if it was story, it was quite irregular as none
of his patterns of assimilation, etc., could explain it. However it is possible to
show that it shares many features with [st:r] and as far as the child's own system
is concerned, is quite regular. The adult form [st:r] has the features (checked)
sibilance and friction, [st], liquid and continuance, [r], more open vowel followed
by more close, [] and [], and rounding and backness in the first syllable and
nonrounding and frontness in the second, with voiceless onset and voiced ending
of the word, i.e., it has the same features as the adult forms grouped under
(2) continuant initial and sibilant final, but in a different sequence. The child's
form [l] has sibilance and friction, [], liquid and continuance, [l], more open
vowel followed by more close, [] and [], rounding and backness in the first
syllable and nonrounding and frontness in the second; all these features are
shared with the adult form, but the child's form has voiced continuant onset
and voiceless sibilant ending, which is a different sequence from the adult's but
is required by the child's system as seen in (2) above. The child's form
thus shares most of the features of the adult's while conforming to her own
system referred to above, viz. continuant initial and sibilant final structure. This
form is thus considered to be completely regular. In none of these cases is it
necessary to bring in the concepts of substitution or metathesis to explain
the differences between the child's and adult forms. The child's forms conform
to the patterns of her own phonological system and, as noted in the case of P, the
sequence and combinations of features in the child's and adult forms are not
always the same.
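The reordering argument above can be restated as a small sketch. This is only an illustration under loud assumptions: the feature labels ("stop", "continuant", "nasal", "sibilant"), the function name child_structure, and the reduction of each adult word to an ordered list of consonantal features are all hypothetical simplifications introduced here, not part of the analysis itself. Vowel grades, voicing, rounding and frontness are ignored, and the nasal stop of stone is entered simply as "stop", following the remark that nasality is not basic in that structure.

```python
# Hypothetical labels for the three onset classes of the sibilant-final
# structures discussed above; they are assumptions made for illustration.
ONSET_CLASSES = {"stop", "continuant", "nasal"}


def child_structure(adult_consonant_features):
    """Map the consonantal features of an adult form onto the child's template.

    `adult_consonant_features` lists feature labels in the order in which
    they occur in the adult word. The child's template keeps the same
    features but fixes their order: onset class first, sibilance last.
    Returns None when the word lacks sibilance or lacks any onset class.
    """
    features = list(adult_consonant_features)
    if "sibilant" not in features:
        return None
    onsets = [f for f in features if f in ONSET_CLASSES]
    if not onsets:
        return None
    # The adult order is deliberately ignored: sibilance always goes last.
    return (onsets[0], "sibilant")


# Adult forms already in the child's order (Kuss, house, mouse).
print(child_structure(["stop", "sibilant"]))
print(child_structure(["continuant", "sibilant"]))
print(child_structure(["nasal", "sibilant"]))

# Adult forms with sibilant onset (steht, stone, story): same features,
# different adult sequence, same child template.
print(child_structure(["sibilant", "stop"]))
print(child_structure(["sibilant", "continuant"]))
```

On this toy encoding, steht and Kuss fall into the same structure although their adult sequences differ, which is the point of the argument: the child's forms are regular with respect to her own system, and no appeal to metathesis is needed.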
It is possible that the perception of phonetic features that have been
described in articulatory terms is, in fact, some kind of perception of acoustic
cues similar to what Fry suggests for the development of the phonemic system
(1966: 197):
It is clear that a very important part of this development of the phonemic system is bound
up with the use of acoustic cues, both for monitoring of the child's own speech and for
the reception of other people's. We now have a considerable body of information about
the operation of these cues in adult speech, although we are still far from understanding
fully how they function, but have no knowledge of the ways in which the use of the cues
develops as speech is acquired.
It is possible, also, that there may be some parallel in the perception of phonetic
features (whatever their nature) with what Piaget calls verbal syncretism
(1967: 131–2); he writes:
Recent research on the nature of perception particularly in connexion with tachistoscopic
reading, and with the perception of forms, has led to the view that objects are recognised
and perceived by us, not because we have analysed them and seen them in detail,
but because of general forms which are as much constructed by ourselves as given
by the elements of the perceived object, and which may be called the schema or the
Gestaltqualität of these objects. For example, a word passes through the tachistoscope
far too rapidly for the letters to be distinguished separately. But one or two of these letters
and the general dimensions of the word are perceived, and that is sufficient to ensure a
correct reading. Each word, therefore, has its own schema.
Piaget considers that such schemata are far more important for the child than
for the adult, as they develop long before the perception of detail, the natural
course of development being from syncretism to a combination of analysis
and synthesis, and not from analysis to syncretism. It thus seems reasonable to
consider that a child perceives some sort of schema in words or utterances
through the recognition of a particular selection of phonetic features (the basic
features) which go into the composition of the forms of the words or groups of
words, and this recognition of a schema results in his producing words of the
same type of structure for such adult forms, e.g., words with consonantal
features continuance and strongly articulated nasal followed by stop have in
the forms of the child P a reduplicated nasal stop pattern, i.e., the Nasal
Structure. A child also recognizes differences in form within the particular
type of structure (the differential features) and this results in his having
different forms within the Structure, and as his skill in perception and
articulation increases, so he perceives and reproduces more and more of the features
of the adult forms. Such a hypothesis seems to link up with what appears to be
currently a widely accepted view of the cognitive development of the child,
i.e., starting with a comparative lack of differentiation and progressing by way
of increasing differentiation. This view may be briefly illustrated by the words
of Brown (1958, reprint 1968: 89):
the primitive stage in cognition is one of a comparative lack of differentiation. Probably
certain distinctions are inescapable; the difference between a loud noise and near silence,
between a bright contour and a dark ground, etc. These inevitable discriminations divide
the perceived world into a small number of very large (abstract) categories. Cognitive
development is increasing differentiation. The more distinctions we make, the more
categories we have and the smaller (more concrete) these are. I think the latter view is
favored in psychology today,
and (1958, reprint 1968: 91):
Psychologists who believe that mental development is from the abstract to the concrete,
from a lack of differentiation to increased differentiation, have been embarrassed by
the fact that vocabulary often builds in the opposite direction. This fact need not trouble
them, since the sequence in which words are acquired is not determined by the cognitive
preferences of children so much as by the naming practices of adults.
In the analysis presented in this chapter, the adult forms, like the child's,
were grouped into five types of structure on the basis of the particular selection
of features which they have in common and these were suggested as the
schemata of the words which the child perceives. Perhaps the first reaction
to such a classification will be to ask whether it is not always possible to find
enough common features among words to group them into any type one may
wish. This may be so when one is dealing with the adult's whole lexicon, but
the child is building up his phonological system from nothing, i.e., one may
consider his competence to be nil at the start (that is to say at the time when he
first begins to understand what is said to him, not when he first begins to
talk), and it seems that the basis on which he builds is the input he receives,
i.e., utterances which are meaningful to him by their function in context, and
these at the start are few in number. He therefore has little or no expectancy
and no conditioning to influence his perception of sounds until he gets some
system built in, i.e., gets some competence. Thus it seems that it is the
selection of features composing the utterances which are the input for the
child that determines the patterns he will acquire, and the input is decided more
by the adults than by the child (see Brown, above). This means that the
sequence in which he registers various utterances will determine which
features he will learn to perceive and reproduce first and will thus determine
the different types of structures he will have in his phonological system. One
may take as an example the words Randall, window, and finger which were all
used frequently to P. He sees them operating in context. They thus become
meaningful for him and therefore claim his attention. He appears to perceive
certain features common to them, i.e., nasal stop which is forcefully
articulated, broad grades of vowel openness in syllables, certain syllable features,
and reproduces these features in a particular way, e.g., reduplicated nasal
stops, etc., and thus he has a new type of structure.
Every child has a different input as different children have different
environments and different things are said to them. This means that it is
possible for every child to register a different set of words and perceive
some similarity in the selection of features of different groups of words,
thus perceiving and reproducing different sets of features. This will result in
different kinds of structures in their phonological systems so that children
learning the same language will have different forms. This does not of
course mean that there cannot be similarity; cf. P's and Hildegard's Sibilant
Structures. Although it has been noticed that there is a tendency for children
to acquire certain sounds earlier than others (see references to Jakobson and
others on p. 62), i.e., those of which the articulation does not require great
skill in timing and coordination, e.g., stops and nasals and an open
unrounded vowel, this does not mean that their phonological patterns will
be the same; in fact they are usually different, and that is why when
children first begin to speak, they are often not understood by speakers of
the same language outside the family.
It is not possible on the evidence of the analysis given in this paper to suggest
whether in the very early stages a child rst observes similarity in the feature
selection of several words before he attempts to reproduce them in speech or
if he perceives particular features in each word independently and reproduces
them. Whatever way it happens, the child produces similar forms, i.e., with the
same basic features, and thus a type of structure with a particular selection of
features in a particular sequence becomes established in the child's phonological
system. Once such a structure is established, he has a framework within
which he perceives other utterances which have the same selection of features,
i.e., he has some competence which gives him a certain expectancy. This may be
illustrated from the structures of Leopold's child with final sibilant fricative
quoted on p. 85 which were obviously based on adult forms with final sibilant
friction. Other words with the same basic features but in a different sequence,
i.e., with sibilant onset instead of ending, e.g., story and stone, were reproduced
by her with features in the same sequence as the rest of her words belonging
to that particular structure, i.e., with sibilant ending, and not in the sequence
found in the adult forms; thus her competence conditioned her performance.
Presumably when a child's perception sharpens and the input includes more
adult forms with the same selection of features but in a sequence different from
the one on which the child's structure is based, the child's structure expands to
include the new sequence.
The writer has not yet made a thorough study of P's acquisition of grammar
but has reason to believe that it is possible to show that the grammar of the
language was acquired in a similar way, i.e., he observes an utterance as a whole
and perceives certain basic features of grammatical structure in the utterance
which are linked with stress and prominence. It is mostly the stressed and
prominent words of a sentence that a child reproduces so that many unstressed
words are left out, hence the telegraphic effect, to borrow a term from Fraser,
Bellugi and Brown (1963, reprint 1968: 50). These then are the basic features
or units on which his sentence structures are built. Such basic units of grammar
are established on the basis of regularly recurring structures which can easily be
related to the context by the child, i.e., such as are functional for him, e.g., in the
case of P such sentences as Bob's a good boy, said to the dog when he does as he
is told, and Anne's a good girl, said to the goat at milking time. Ungrammatical
or anomalous sentences are unlikely to play a part because they do not recur
often enough for a child to register them. It is possible, therefore, that a child
perceives certain basic patterns of regularly used sentence types, i.e., the
schemata of the sentences. These are reproduced by him and he uses such
patterning as a model for his own sentences; cf. P's reciting of sentence patterns
which are obviously his own creations, e.g., on 11.7.60 he said [n g g:, dada
gg:, ba: gg:] (Anne's a good girl, Daddy's a good girl, Bob's a good girl).
These are apparently modelled on Anne's a good girl. As a child gains more
experience and as his phonological system develops and he is able to perceive
more, he appears to grow more aware of finer grammatical distinctions,
many of which occur mainly in unstressed positions in the utterance, such as
prepositions, conjunctions, weak forms, gender concords, etc., which in the
early stages are not reproduced and probably are not so clearly perceived; they
get gradually incorporated into his sentence structures and so his grammatical
system grows. The basic units of sentences can be expected to vary to some
extent from one child to another, as what would be basic for each child would
depend on the type of sentence structures which were the input for him; but
it seems that more similarity can be expected in the basic units of grammatical
structures than in phonological structures of English children because
English-speaking adults seem to use the same sort of sentence structures to
children in the main so that they have mostly the same sort of structures as
input, e.g., simplified grammatical structures such as Mummy do it, where's
pencil?, Baby want Teddy?.
The study of the pattern of the acquisition of the grammar requires a separate
paper and the brief comment on the subject is only put forward here as the
obvious corollary of the pattern of the acquisition of the phonological system
of the child P, thus showing that the pattern of the acquisition of grammar and
phonology seems to be a coherent whole. It is somewhat rash to put forward
speculations about the acquisition of grammar before the grammatical study is
complete but they are made in the hope that those concerned with problems of
language acquisition will be provoked either to support the views expressed
here or to offer reasoned arguments to disprove them.
At the present time there is much speculation about what constitutes a child's
capacity for language acquisition, e.g., Chomsky (1965: 3–62, 1966: 11113,
1967: 397–442), Katz (1966: 240–82), McNeill (1966: 65–85), Lenneberg
(1965, 1966a, 1966b, 1967). The evidence given in this chapter suggests that
P perceived some sort of schema through the recognition of a particular set of
features out of the selection of features of which groups of adult forms were
composed, and this resulted in his producing his own related forms with one
structural pattern. If this proves to be the general pattern of how a child acquires
the phonological system of his mother tongue, it may be that part of a child's
capacity for the acquisition of the phonological system is the ability to perceive
schemata in the sound patterns of utterances.
Notes
1. [d ] = labialized [d]; [ ] = labialized [].
2. The sign stands for no recorded form. The first recorded form for flower was at 1;6.
References
Bellugi, U. and Brown, R. (1964). The acquisition of language. (Monograph of the
Society for Research in Child Development, serial no. 92. 29. 1.) Lafayette, IN: Child
Development Publications.
Brown, R. (1958). How shall a thing be called? Psychological Review, 65, 14–21.
Reprinted in Oldfield and Marshall (eds.) (1968).
Carroll, J. B. (1961). Language development in children. In Saporta (ed.) (1961).
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
(1966). Current issues in linguistic theory. The Hague: Mouton.
(1967). The formal nature of language. In Lenneberg (1967), pp. 397–442.
Cohen, M. (1969). Sur l'étude du langage enfantin. Enfance, 34. Paris.
Fodor, J. A. and Katz, J. J. (eds.) (1965). The structure of language. Englewood Cliffs,
NJ: Prentice-Hall.
Fraser, C., Bellugi, U., and Brown, R. (1963). Control of grammar in imitation,
comprehension and production. Journal of Verbal Learning and Verbal Behavior, 2, 121–35.
Reprinted in Oldfield and Marshall (eds.) (1968).
Fry, D. B. (1966). The development of the phonological system in the normal and the
deaf child. In Smith and Miller (eds.) (1966), pp. 187–206.
Gimson, A. C. (1964). An introduction to the pronunciation of English. London: Edward
Arnold.
Grégoire, A. (1933). L'apprentissage de la parole pendant les deux premières années de
l'enfance. Psychologie du langage, ch. 5. Paris.
(1947). L'apprentissage du langage. vol. 2, Faculté de philosophie et lettres. Liège.
Fascicule CVI. Librairie E. Droz, Paris 25, rue de Tournon.
Ingram, T. T. S. (1966). Syntactic regularities. General discussion. In Lyons and Wales
(eds.) (1966), pp. 214–19.
Jakobson, R. (1941/1968). Child language, aphasia and phonological universals, trans.
A. R. Keiler. The Hague: Mouton. (Originally published as Kindersprache, Aphasie
Jakobson, R. and Halle, M. (1961). Phonemic patterning. In Saporta (ed.) (1961).
Jones, D. (1962). An outline of English phonetics, 9th edn. Cambridge: Heffer.
Katz, J. J. (1966). The philosophy of language. New York and London: Harper & Row.
Ladefoged, P. (1967). Three areas of experimental phonetics. London: Oxford University
Press.
Lenneberg, E. H. (1965). The capacity for language acquisition. In Fodor and Katz (eds.)
(1965), pp. 579–603.
(1966a). A biological perspective of language. In E. H. Lenneberg (ed.), New directions
in the study of language, pp. 65–88. Cambridge, MA: MIT Press.
(1966b). The natural history of language. In Smith and Miller (eds.) (1966), pp. 219–52.
(1967). Biological foundations of language. New York, London, and Sydney:
J. Wiley & Sons.
Leopold, W. F. (1939). Speech development of a bilingual child, vol. 1 (1954 reprint).
Evanston: Northwestern University Press.
(1947). Speech development of a bilingual child, vol. 2. Evanston: Northwestern
University Press.
(1961). Patterning in children's language learning. In Saporta (ed.) (1961), pp. 350–8.
Lewis, M. M. (1968). Infant speech: a study in the beginnings of language. London:
Routledge & Kegan Paul.
Lyons, J. and Wales, R. J. (eds.) (1966). Psycholinguistics papers (Proceedings of the
Edinburgh Conference 1966). Edinburgh University Press.
McNeill, D. (1966). The creation of language by children. In Lyons and Wales (eds.)
(1966), pp. 99–132.
Ohnesorg, K. (1959). Druhá fonetická studie o dětské řeči. Brno: Spisy university v Brně,
filosofická fakulta. 57.
Oldfield, R. C. and Marshall, J. C. (eds.) (1968). Language. Harmondsworth: Penguin.
Piaget, J. (1967). The language and thought of the child, trans. M. and R. Gabain.
London: Routledge & Kegan Paul.
Saporta, S. (ed.) (1961). Psycholinguistics: a book of readings. New York: Holt,
Rinehart & Winston.
Smith, F. and Miller, G. A. (eds.) (1966). The genesis of language: a psycholinguistic
approach. Cambridge, MA, and London: MIT Press.
Velten, H. V. (1943). The growth of phonemic and lexical patterns in infant language.
Waterson, N. (1970). Some speech forms of an English child: a phonological study.
Transactions of the Philological Society, 3450.
Weir, R. H. (1962). Language in the crib (Janua Linguarum. Series maior, XIV). The
Hague: Mouton.
4 Words and sounds in early language acquisition
Charles A. Ferguson and Carol B. Farwell
In acquiring full control over the language of his speech community, the child
must learn to deal with an enormous array of lexical and phonological
elements, as well as with the complex relations among these elements which
constitute the grammar of a particular language, different from all other
possible languages. In addition to the machinery of the language itself, he
must learn when and how to use the language in accordance with his own
needs and the norms of the community. And all this confronts the child not in
neat, separate units, but in conglomerate batches which he must largely sort
out for himself. Even if the speech input to which he is exposed is restricted in
scope and simplified in structure, as the talk addressed to young children tends
to be, the analytic problem is severe, and it must not be expected that the
child's early attempts will match with any great precision the adult's language
behavior and its underlying principles of organization.
Thus the linguist who wishes to identify analytic units in the child's
speech encounters even greater pitfalls than he does in abstracting from
the adult's speech those components at various levels which merit analytic
autonomy. Looking for distinctive features, inflectional categories, syntactic
rules, and all the dozens of other possible basic units in a child's linguistic
system is a hazardous pastime; yet if we are to understand the processes of
language development, indeed of language behavior in general, we must
make the effort to do so, since it is manifestly impossible to deal with the
child's language in one large undifferentiated mass.¹ In the present study, we
examine the language development of the child in terms of two putative
units: words and word-initial consonants. In the description and analysis
which follow, no assertion is made that these units are independent of all
other possible units, or that recognition of these two units precludes
recognizing certain other possible units (e.g., morphemes/formatives, syllables,
sentences, prosodies, schemata, idioms, distinctive features, rules,
agreement . . .). What is assumed is twofold: (1) words and word-initial
consonants are valid units of analysis from the earliest productions of
meaningful speech by the child, and (2) it is instructive to study these two
units in relation to each other.
1. Data
The data used here are a small part of those collected in a longitudinal study of
seven children, conducted as a part of research on the development of
consonants in first-language learning.² The children, four girls and three boys of
monolingual English background, were selected for the study when they were
reported by their parents to use several words. Ages at the beginning of the
study ranged from 0;11 to 1;2.
1.1. Procedure
Each child was visited at home at approximately weekly intervals for seven to
ten months, with occasional larger gaps because of illness and family vacations.
Three observers participated in the project, two attending each session when
possible. For about half the sessions, only one observer was present, but each
child was seen consistently by the same observer.
During each half-hour visit, attempts were made by parents and observers to
elicit as many of the child's words as possible, by the use of picture books and
things familiar to him (food, toys, etc.). The sessions were tape-recorded, and
notes were made by the observer(s) of the probable adult equivalent of each
utterance. Utterances were considered meaningful if there was sufficient
consistency to allow recognition of the form, and if there was some consistency in
reference or accompanying action, not necessarily exactly that expected from
the meaning of the adult word. Similarly, it was not required that a specific adult
English equivalent should be identified. Occasionally, it was found that a child
would consistently use a form for which no probable adult equivalent could
be imagined. In fact, however, such uninterpretable words occurred much
less frequently than expected. They were included in the data, as well as
forms which seemed to correspond to whole adult phrases rather than words,
e.g., I see you.
Identification of words was aided by parents' recognition, although observers
often obtained evidence of the use of a particular word before parents noticed
it. We assume that our judgment of the identity of meaningful forms is valid.
McCurry and Irwin (1953) demonstrated 91 percent inter-observer agreement in
the determination of meaningful utterances and their referents in naturalistic
settings, and our agreement in sessions attended by more than one observer was
similarly high.
Child utterances were transcribed using the techniques established by the
Phonetics Workshop of the Child Phonology Project, Fall 1971, and problems
were referred to that workshop. An expanded IPA symbol grid was used
(Johnson and Bush 1972). Transcription is to a level comparable to that in
Leopold (1939–49), with narrower transcription of initial consonants and less
attention to vowels.
1.2. Subjects
This chapter reports the early stages of development of two girls, T and K, from
the larger study. Utterances occurring from the beginning of the study to the
week in which the fiftieth word type was recorded are included.³ In order to
provide a reference point for our analysis, Hildegard Leopold (H) has been
included as a third subject, since information about her development is widely
known and generally available.
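The cutoff described above (all sessions up to and including the one in which the fiftieth word type was recorded) can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name sessions_up_to_nth_type, the representation of a session as a list of word labels, and the toy data are all hypothetical.

```python
def sessions_up_to_nth_type(sessions, n=50):
    """Return the sessions up to and including the one in which the
    cumulative number of distinct word types first reaches `n`.

    `sessions` is a chronologically ordered list; each session is a list
    of word labels (the probable adult equivalents), repeats allowed.
    Also returns the number of distinct types seen in the kept sessions.
    If `n` is never reached, every session is kept.
    """
    seen_types = set()
    kept = []
    for words in sessions:
        kept.append(words)
        seen_types.update(words)
        if len(seen_types) >= n:
            break  # the session containing the n-th type is still included
    return kept, len(seen_types)


# Toy data with a cutoff of 4 types instead of 50.
toy_sessions = [
    ["hi", "ball"],
    ["ball", "dog"],
    ["cat", "book"],
    ["shoe"],
]
kept, n_types = sessions_up_to_nth_type(toy_sessions, n=4)
print(len(kept), n_types)
```

Because the whole session containing the n-th type is retained, the number of types included can slightly exceed the cutoff, as it does in the toy data.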
T was a first child and spent almost all her time with her two parents. Her
mother kept a detailed list of words produced by T during each week, and
conscientiously elicited new words for us at each session. T had just begun to
walk when we started our study. She did not engage in much babbling, and she
imitated only infrequently, usually words she had already produced herself.
Pivot-type syntax, especially with the words hi and where, was evident from the
first sessions; and two-word utterances became more common soon after the
session with the fiftieth word.
K had an older brother and, possibly in self-defense, was physically aggres-
sive and active. She spent time with both her parents and a housekeeper, and
was often left alone with investigators during a taping session. Our tapes of K
contain a lot of babbling, or at least unintelligible speech, and she showed
willingness to imitate almost any word beginning with a sound at least close to
one she could say. Even during the first sessions where our data are scanty, she
would occasionally imitate or even spontaneously say three-word sentences;
and our general impression was that she was more adventuresome and less
concerned with details than T and H.
H, a first child, was deliberately raised as a bilingual: her father spoke to her
only in German, her mother in English. She spent two months (age 1;0 and 1;1)
in Germany where even her mother spoke only German to her, and for some
time on her return she did not understand English. To make the Leopold data
comparable, only the words which H still said at age 1;0 were included in the
study; but those words are followed from their beginnings, back to 0;10. She
imitated very rarely, always words which she understood. Until 1;5, the last
month considered here, many of H's words occurred only in whispered form,
although some had full voice from the beginning, and a whisper/voice
distinction sometimes separated homonyms. H learned to walk in the second half of
1;1, a month and a half after the beginning of the period studied here. On the
whole, H was cautious: "It was characteristic of her that she generally avoided
altogether any words the meaning and form of which she could not successfully
cope with" (Leopold, I.172).⁴
The children and the number of sessions reported here are shown in Table 4.1.
Because no natural criteria present themselves for grouping weekly sessions
together, each session has been analyzed separately. For H, grouping is done
month by month, since Leopold tells us only the month in which each form
occurred. The main effect of the use of larger time divisions with H is that
fluctuations from day to day are likely to be lost in the general trend of
development. This tendency, coupled with the fact that Leopold often reports
only a few phonetic variants of a word during a month, while one of our children
might produce as many as eight variants of a word in one session, tends to
make H's progress look much smoother than that of the other two children. Far
from making the two sources of data incompatible, such a difference can be put
to good use: H's development can help us recognize overall trends within the
variant forms in our data, while our data can make clear the degree of
simplification in the H data.
1.3. Imitations and other problems
In a study of child phonology, as in any other phonological work, it is common
to exclude certain problematic forms of data from analysis. For example,
utterances in which a child imitates or echoes an immediately prior adult
utterance are often separated from other, spontaneous utterances. Researchers
have sometimes found that such imitations may be more accurate phonetically
than the same forms said spontaneously; and they have excluded imitations in
order to maximize the number of utterances processed by the child's phonological
system, rather than by a separate imitative ability.
There are several reasons why we have not excluded imitations from analysis
in this study. For one thing, a very high percentage of what a one-year-old says
is imitated, so that there is very little purely spontaneous data. Furthermore, a
study of the forms collected shows that a separation of imitated from sponta-
neous forms, where the two can be compared, does not correspond in any
straightforward way to a separation of different forms of the same word.
Finally, even children this young can repeat or imitate things said by adults at
some distance of time (five minutes or more) despite considerable intervening
speech, so that no simple definition of "imitation" is feasible. Hence a separation
of imitated utterances has not been carried out here, since it would lead to a great
reduction of available data without any demonstrable gains of accuracy or
homogeneity, although such a separation might be methodologically sound
when dealing with older children, where data are not so limited. (For discussions
of this whole question from different points of view, see Templin 1947,
Olmsted 1971: 94-5, and Edwards and Garnica 1973.)
Table 4.1. Periods of elicitation for the three subjects T, K, and H
Child   Age at beginning   Number of sessions   Time span   Total no. of words
T       0;11               9                    13 weeks    51
K       1;2                13                   13 weeks    72
H       1;0                -                    6 months    54
Several kinds of data have been excluded, however. In order to make the three
children comparable, forms which Leopold himself questions or which H
seemed to repeat once have been excluded, as well as exclamations which
probably would not have been collected from our children. Some of H's words
have been included several months later than Leopold first lists them. Similarly,
marginal forms such as mmm, hm-m, tsk-tsk, etc., as well as onomatopoeic
words in which imitative qualities obscure the segmental phonology, have been
left out in all three children. However, H's sch-sch has been included because of
its conventional referential meaning, although it is extremely marginal phonologically,
the [ʃ] being syllabic and not occurring before a vowel, like other
consonants.
5
Finally, certain forms have been included even though they present problems
for the analysis of word-initial consonants. A short listing of three cases in
which this occurs may help explain some of the variation observed:
(a) Backgrounding: the word-initial consonant is deleted or drastically reduced
when the child is working on another part of the word (for full discussion
of trade-off phenomena in phonological development, see Edwards and
Garnica). One example from our data shows two forms of a word: (T IX)
milk [b, k].
6
(b) Assimilation and syllable deletion: here a word-initial consonant is
affected by a phonological rule. Such cases are familiar from the literature.
Examples of each are: (K IX) fish [i, kʰi], (K IV) thank you [ᵐkj].
7
(c) Prosodic phenomena: here the child treats the whole word, rather than its
segments, as a phonological unit. Two examples are: (T III) shoe [guti,
gutidi], (T IX) feet [ᵗ].
8
1.4. Phone classes and phone trees
One way to proceed in analyzing the initial consonants in the data would be to
group together all recurrences of the same phonetic symbols used in tran-
scription. Such a structureless listing is unilluminating for several reasons.
First, it simply does not show which different symbols might be regarded as
variants of one another, i.e., which sounds are in some structural sense related
and which are not. How similar must two sounds be for the analyst to decide
they belong together? Second, it does not allow the very likely possibility of
overlap in the phonetic value of different structural units or features. The phone
represented by a given phonetic symbol may be a production sometimes of one
phonological unit, sometimes of another. Finally, this procedure offers no
satisfactory way to relate the phones of one session with those of another
session. If one speech sound has changed sufficiently between one session
and the next to be reported with a different symbol, how does the analyst
recognize this fact? Or if a child has nine phones (i.e., different phonetic
symbols) at one session, and twelve at the next, how is one to relate the two
systems?
What is needed is a way to determine which phones belong together or
correspond to one another, and the most obvious way is to use the word as the
framework for phone identification and classification. This is hardly a new idea,
since it is implicit in much of the phonological analysis of child language, but it
seems never to be made explicit (thus Francescato 1968 criticizes Jakobson and
others for not making explicit use of the word, although he himself does not
offer analysis of this kind).
By using the word as the basis of comparison, it is possible to establish the
notion of "correspondence" or "corresponding phones," similar to the notion of
sound correspondence in comparative linguistics. For the purposes of our study,
in which we are dealing only with initial consonants, we may define corresponding
phones essentially as any two consonants which begin different
utterances of the same word, whether at a single session or different sessions.
This definition must be modified to exclude instances of omission or assimilation
which may put non-corresponding phones in initial position.
The procedures employed in our analysis were as follows. For each session,
all the renditions of a given word were grouped together, and all variants of the
initial consonants in those renditions were noted. Then all words beginning with
the same phone or set of variant phones were put together. The set of initial-
consonant variants of each of these groups of words constitutes a phone class,
and is represented by the appropriate phonetic symbols in a box, or between
vertical lines.
9
Thus a phone class |d ~ tʰ| consists of the initial consonants of all
of those words whose initial-consonant sound varied between [d] and [tʰ]. All
the phone classes of one child at one session were represented by boxes in a
horizontal row, arranged roughly in order of place of articulation. Thus a child
might show three phone classes of initial consonants at a particular session:
|p ~ b|   |m|   |t ~ d|
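The grouping procedure just described can be sketched in code. What follows is a minimal illustration only, not the procedure actually used in the study. It assumes that each rendition is recorded as a (word, initial phone) pair, and it places two words in the same phone class only when their sets of initial-consonant variants are identical, which is stricter than the analyst's judgment applied in this chapter. The function names and the sample data are hypothetical; "th" stands for aspirated [t].

```python
from collections import defaultdict


def phone_classes(session_tokens):
    """Group the words of one session by their set of initial-consonant variants.

    session_tokens: iterable of (word, initial_phone) pairs, one per rendition.
    Returns a dict mapping a frozenset of phones to a sorted list of words.
    """
    # Step 1: collect every initial-consonant variant observed for each word.
    variants = defaultdict(set)
    for word, phone in session_tokens:
        variants[word].add(phone)

    # Step 2: put together all words that share the same set of variants
    # (an assumption of this sketch; the study allows looser groupings).
    classes = defaultdict(list)
    for word, phones in variants.items():
        classes[frozenset(phones)].append(word)
    return {phones: sorted(words) for phones, words in classes.items()}


def label(phones):
    """Render a phone class in the |x ~ y| notation used in the text."""
    return "|" + " ~ ".join(sorted(phones)) + "|"


# Hypothetical renditions from a single session.
tokens = [
    ("daddy", "d"), ("daddy", "th"),
    ("dog", "d"), ("dog", "th"),
    ("mama", "m"), ("milk", "m"),
]
classes = phone_classes(tokens)
for phones, words in classes.items():
    print(label(phones), words)
```

With these invented renditions, daddy and dog fall into one class and mama and milk into another. Merging classes whose variant sets merely overlap would be a different design choice, and it would not reproduce the separation of the [b] words from the aspirated [p] words discussed below.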
After this, phone classes in different sessions were constructed according to
the occurrences of the same word. With each session making up a horizontal
level, solid vertical lines were drawn between successive phone classes if they
contained the same word. If successive phone classes did not contain the same
word but were related to phone classes which did, dotted lines were drawn
connecting them. For example, in T's |m| class:
|m|  mama
 :
|m|  milk
 |
|m|  milk; mama
In addition, and especially in the case of K, dotted lines were used to connect
phone classes which were each well-motivated and were phonetically close or
identical, but shared no words in common (especially K's |b ~ p| in IX to XII).
10
Diagrams of this kind which connect corresponding phone classes of suc-
cessive stages constitute phone trees. The phone trees constructed for T, K,
and H appear as Figures 4.1, 4.2, and 4.3. In each figure, the number in
parentheses to the right of each phone class indicates the number of words
belonging to that class.
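The linking step can be sketched in the same spirit. The code below is an illustrative assumption, not the authors' procedure: it computes only the solid lines, that is, links between phone classes in successive sessions that contain the same word. The dotted lines depend on the analyst's judgment of relatedness and phonetic closeness and are not modeled. The function name is hypothetical, and the three sessions reproduce the |m| example given above.

```python
def solid_links(sessions):
    """Find solid-line links between phone classes of successive sessions.

    sessions: list of dicts {class_label: set_of_words}, in chronological order.
    Returns a list of (session_index, upper_label, lower_label, shared_words),
    where session_index is the position of the earlier of the two sessions.
    """
    links = []
    for i in range(len(sessions) - 1):
        for upper, upper_words in sessions[i].items():
            for lower, lower_words in sessions[i + 1].items():
                shared = upper_words & lower_words
                if shared:  # same word in both classes: draw a solid line
                    links.append((i, upper, lower, sorted(shared)))
    return links


# The |m| example: mama alone, then milk alone, then both words.
sessions = [
    {"|m|": {"mama"}},
    {"|m|": {"milk"}},
    {"|m|": {"milk", "mama"}},
]
links = solid_links(sessions)
print(links)
```

Only the second and third sessions share a word (milk), so only that pair receives a solid link. The first pair shares no word, which is the situation in which the text draws a dotted line.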
Sometimes the phone classes are not as simple as described above. Thus the
phone class |b ~ ~ bw ~ pʰ ~ ~ | in T VI contains the following words and
initial-consonant variations: baby [b ~ ], ball [b], blanket [b], book [b ~ ],
bounce [b], bye-bye [b ~ pʰ], paper [b ~ ]. One might reasonably make
several phone classes out of these words, perhaps separating those in which [b]
does not vary or varies only with from those in which variation is with a
fricative or voiceless stop. For our purposes, they have been grouped together in
opposition to the phone class |pʰ| in which the following words occur: pat, please,
pretty, purse, all beginning only with aspirated [p]. The claim of this grouping is
that it is only accidental that some words in the |b| class were found with variation
of one sort, and some with another; but that it is not accidental that the words in the
|b| class are separate from those in the |p| class.
In fact, if we look at the corresponding classes in the next session, we find the
following: baby [b ~ w ~ p], ball [b], bang [b], blanket [b], book [b], bounce [b],
box [b], bye-bye [b ~ ]; but paper [pʰ], pat [pʰ], purse [p]. From the data
listings, it can be seen that baby occurs seven times with an initial [b], once with
a [pʰ], and once with a [w]. Bye-bye occurs three times with a [b] and once with a
[]. Hence it seems justifiable to group them with other [b] words, and again it
seems that the important split is between the [b] words and the [p] words.
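The token counts just cited can be tallied explicitly. The short sketch below uses only the counts reported in this paragraph. The label "ph" stands for aspirated [p], and "other" stands in for the remaining variant of bye-bye; both labels, like the function name, are conventions of this sketch rather than part of the original analysis.

```python
from collections import Counter

# Initial-consonant tokens as reported in the text for the session after T VI.
tokens = {
    "baby": Counter({"b": 7, "ph": 1, "w": 1}),
    "bye-bye": Counter({"b": 3, "other": 1}),
}


def dominant_variant(counts):
    """Return (phone, share) for the most frequent initial consonant."""
    phone, n = counts.most_common(1)[0]
    return phone, n / sum(counts.values())


summary = {word: dominant_variant(c) for word, c in tokens.items()}
for word, (phone, share) in summary.items():
    print(f"{word}: [{phone}] in {share:.0%} of tokens")
```

In both words [b] accounts for well over half of the tokens (seven of nine for baby, three of four for bye-bye), which is the quantitative basis for grouping them with the other [b] words.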
The notion "phone class" here is similar to the notion "phoneme" of
American structuralism, in that it refers to a class of phonetically similar speech
sounds believed to contrast with other such classes, as shown by lexical
identifications. The determination of the phone classes of a particular child's
speech is made by methods similar to linguists' procedures of elicitation and
phonemic analysis, but largely without the benefit of minimal pairs and speakers'
judgments. The purpose of the exercise (as ultimately for phonemic analysis
as well?) is to locate valid behavioral units.
In general, an attempt was made to distinguish as few phone classes as
possible, so that any error would be in the direction of underdifferentiation.
Consider the word dog in T I-VIII. It is included in phone classes with some
variation even though the word dog itself is consistently produced with an initial
d. By Session VIII, however, dog seems to belong to a phone class by itself, and
perhaps it should have been separated all along.
Even with the policy of minimal differentiation, it may happen that phone
classes are separated unjustly. Consider T's two classes |ᵗs ~ s ~ | and | ~ ~ d|
in Session V. Although the regular criteria require their separation during that
one session, the fact that they are joined in the sessions before and after
suggests that the criteria are misleading in this case. A similar example is
the separation of |d ~ tʰ| and |tʰ| classes in Sessions VIII and IX. However,
since mergings are easy to see in the phone trees, such cases have been left as
originally analyzed. The lack of such phenomena in the H data is explainable by
the longer time periods contained in each stage.
Figure 4.1. Phone trees of T
Figure 4.2. Phone trees of K
Figure 4.3. Phone trees of H
2. Analysis
Given the organization of the observed data into phone classes and phone trees, we
should be able to compare them to our general expectations of the course of
phonological development as it has been previously reported. When we do this, we
find certain surprising tendencies in our data. One inconsistency is the existence of
a high level of variation of word forms. The range of variability plus certain regular
forms of variation together make it difficult to make statements about either
phonological contrasts or unique underlying forms and systematic rules, so that
traditional forms of phonological analysis are not strictly applicable.
Another surprise is that many words seem to have more accurate renditions at
this early stage than would be expected. Furthermore, the child will sometimes
reduce an earlier, more accurate form as his learning proceeds. A final and
related surprise is the seeming great selectivity of the child in deciding which
words he will try to produce.
All these aspects of our data point to one principle which puts them in the
proper perspective. Phonological development in children, like sound change in
language, takes place on several parameters, only one of which is the phonetic.
Here it is useful to consider the lexical parameter.
2.1. The lexical parameter in sound change
In general linguistic theory, synchronic or diachronic, the goal is to find generalizations
of maximum validity; as a consequence, little attention is paid to differences
in the behavior of individual words. Only the field of linguistic geography,
with its slogan of "Every word has its own history," has represented the opposite
tendency (Malkiel 1967). Similarly, in the studies of child language development,
both phonological and grammatical, the effort to find generalizations has tended to
exclude the study of individual words. Even the large literature on child vocabulary
development is mostly concerned with estimating the extent of the total
lexicon at successive stages rather than with tracing the history of individual
words. One exception is Leopold's account of Hildegard's vocabulary development,
which in many respects is one of the most informative (esp. I.149-79).
European linguistic geographers, working with some of the same languages
in which the neogrammarians had shown regular sound correspondences, found
that isoglosses marking the extent of each sound change varied from word to
word; they thus showed the simple neogrammarian model of linguistic change
to be inadequate in spite of impressive evidence in its favor. The dialectologists'
view and the neogrammarian model, each in several forms, tended to remain
side by side in linguistic theory without integration (see Bloomfield 1933,
chs. 19 and 20). Some recent models of linguistic change, such as that proposed
by Wang (1969), attempt to account for both sets of facts. Wang suggests that
sound change takes place on three parameters. On the phonetic parameter, the
phonetic manifestation of a sound change occurs abruptly at some point, goes
through a period of variation in which some words are found in two forms, and
finally approaches completion, whereupon other forms may change abruptly
without going through a period of variation. On the lexical parameter, sound
change starts in a subset of the relevant words (determined phonetically,
socially, or by other factors) and spreads gradually through the lexicon to
other relevant forms. On the social parameter, sound change starts with some
group of people and spreads to others, or it begins in one speech style and
spreads to other styles in the same individual. (For discussion of a sound change
in a framework of this kind, see Ferguson 1971.)
Linguists have begun to acknowledge the phenomenon of variation which
accompanies linguistic change, but the lexical parameter has remained largely
ignored both by American structuralists and by present-day generativists, who
assume it is the phonological system which changes, regarding words as products
of the system, rather than as having a phonological existence of their own.
11
The parallels between sound change in language history and sound change in
child phonology development have often been drawn, and Jakobson (1941/
1968: 18) quotes Grammont's cogent remarks of decades earlier: "By collecting
the linguistic peculiarities of a very large number of children, one could construct
a kind of grammar of changes which have appeared and can appear
somewhere in language." As an example of a parallel we could draw attention
to the phonologization by borrowing outlined by Jakobson 1949, using Russian
/f/ in illustration. The adult model for the child's language is the analog of the
source language in the Russian example, and the full acquisition of a phoneme
by the child parallels the final addition of /f/ to the Russian inventory.
Here, however, we are suggesting a rather thorough-going application of
Wang's model of linguistic change to some of the phenomena of child language
acquisition (see also Hsieh 1972 and Moskowitz 1972, where aspects of Wang's
model are applied to child phonology). Of Wang's three parameters, the social
seems least relevant at this early age, since sound changes are taking place
within an individual, and children under two have very little control of different
speech registers (Berko Gleason 1973 suggests that their main register split is
speech vs. silence; see also Weeks 1971). The other two parameters, however,
are fully relevant; and we would like to argue that one cannot profitably study
either the phonetic or the lexical parameter of child language acquisition without
taking account of the other.
2.2. Contrasts
From the earliest months of language development, some words assume a
relatively stable phonetic form, while others vary considerably. Variable
words are often those which have more advanced canonical forms or harder
sounds, so that variation can be explained as a kind of struggle with the word. In
other cases, however, relatively difficult words have early stable forms. H's
word tick-tock, which she first attempted at 0;11, assumes the form [ti-ta] from
1;0 on, and maintains that form steadily for months. Similarly, the word Carolyn
becomes stable for H as [da-da]. Yet some relatively simple words show
variation, such as H's ball (1;2-1;5) or her mama, which gave her two months
of trouble before stabilizing. Compare also the relative stability of T's rock-rock
(V-VIII) and book (VI-IX) with the relative instability of baby (II-IX), daddy
(I-IX), and milk (VII-IX).
One important consequence of the existence of variable forms emerges if an
attempt is made to determine phonological contrasts at these early stages.
Consider the contrasts m/b and m/n in K's forms. From session I on, K has
something which one might call an m class, including words which start with
/m/ in the adult language, as well as occasional /n/ words (Nona, V; night-night,
VIII). This fact, along with the [m ~ n] variation at V, might lead us to think that
there is no m/n contrast. At the same time, there is a b class containing some
forms which start with /m/ in the adult model (moo, VII; mama, XII). In all this
time, there are no minimal pairs which would establish an m/b contrast in
phonemic terms: there are simply some forms which start with [b] and some
which start with [m]. From Session IV on, furthermore, there is an n class
containing only one word (no), which never varies with [m]. (In XII, another
word, nose, is added.) So again, although there may be no m/n contrast in the
usual sense, there are words which start with [m] or [m ~ n] and other words
which start only with [n]. Cruttenden 1970, in discussing a very similar example,
suggests: "It may be that it is only possible at first to make statements about
the existence of contrast between individual words." It does seem from our data
that it is often impossible to make well-motivated claims about phonological
contrasts in the usual sense at these early stages, as some might wish to do.
Often, variable forms and partial contrasts seem to correspond to a sound
change in progress, as we might expect from our model of phonetic change. One
example is in H's p and b classes from 1;0 to 1;3. By looking at the phone tree, it
can be seen that at 1;0 there are two well-motivated phone classes, |p| and |b|. At
1;1, one of the p words has begun to show variation; at 1;2, it is joined by one of
the b words. Finally, at 1;3, the third stage of phonetic change is observed: the
varying words join the |b| class along with a p word that has never shown
variation, as predicted by Wang's model. Other |p| words tend to drop out, while
new |b| words are learned. At this point, the sound change is complete, leaving
one residual form. The sound change that has occurred can be described as the
acquisition of a rule of voicing which states that initial consonants tend to be
voiced (see Ingram 1973: 59). For several months after 1;3, the |b| class will be
the dominant labial-stop class, whereas the |p| class will contain a small number
of residual or marginal forms. Note that a similar change takes place in H's |t|
and |d| classes at 1;3 and 1;4, so that we may say that the voicing rule has spread
to the alveolar stops.
This sound change is really quite a strange step from a Jakobsonian point of
view. Rather than the learning of an opposition, this sound change results in a loss
of an earlier lexical, if not phonemic, opposition. The p word which starts the
change at 1;1 is papa, a word that begins with a p in the adult language. The data
show that papa is first used correctly with a p, then later becomes variable and
finally joins the group of words beginning with b's. In other words, papa has
become less like the model language in the process. That this case is not an
unusual one can be seen from an examination of what have been called "progressive
phonological idioms" (Moskowitz 1971, 1972).
2.3. Phonological idioms
The clearest example of a progressive phonological idiom is pretty, H's first
permanent word, occurring in almost perfect phonetic form at 0;10. At a time
when other words are monosyllabic or have reduplicated consonants, pretty has
two syllables beginning with different consonants, and often has a successful
initial cluster. Only much later (1;9) does pretty become integrated into H's
phonological system, taking the reduced form [piti], and even later (1;10)
becoming [bidi].
The opposite kind of phonological idiom, the regressive idiom, is not so
obvious at these early stages of acquisition. A regressive idiom is a word which
maintains an earlier form even though a different form would be expected, given
the child's phonological system. Regressive idioms are usually more reduced
than forms in the current system; but in cases like the voicing change, forms
which are better in terms of the adult phonology may be regressive idioms if
they maintain a contrast no longer in the system. Thus pretty, when it is not
affected by the voicing change, remains progressive in terms of its total form,
but becomes regressive in terms of its initial consonant.
Progressive idioms suggest that a child's perceptual and productive abilities
are more advanced than the phonological system seemingly exemplified by
most of his words; but the extent to which they are actually more advanced is
open to question. Since progressive idioms are by definition marginal or extrasystemic,
a linguist searching for generalizations might want to exclude them
from his data. However, determining which forms are progressive, apart from
the most obvious examples, implies a prior determination of a phonological
system, already shown to be a difficult or questionable task at the earliest
stages of development.
One might assume that any word which changes from an earlier more
phonetically accurate form to a more reduced form has been a progressive
idiom until the time that it is reduced (ignoring the problems of determining
what word form is more phonetically accurate, since one sound in a word may
change in one way, and another sound in another way). Given this definition,
H's papa is a progressive idiom when it is pronounced with a p, but it joins the
system when it is pronounced with a b. A generalizing approach would then
simply ignore the earlier form [papa] in describing the development of the
child's phonology.
It is hard to see the full consequences of this policy from our data, since the
time section is so short; but another example may make the dangers clear. The
word hello, which has just appeared in the form [l] at H 1;5, can be used. At
the same time that hello appears, H begins to use l's occasionally in other forms
(e.g., klingelingeling), and at 1;7 she adds alle to her vocabulary with an l.
Hello maintains the same form until 1;10, at which time it becomes [jojo] by a
liquid reduction rule and reduplication, making its form more primitive. At
about this same time, some other words participate in the liquid reduction
rule, while some show variation, and still others have l's. Data are given in
Table 4.2.
From this account, one could claim that hello and alle are progressive idioms
for several months, thus ignoring them and maintaining that l is acquired first as /j/,
which later splits into /j/ and /l/. What actually seems to be happening, however,
is that two sound changes are occurring simultaneously. One, the acquisition of l,
or the combining of features for liquids, begins around 1;5 in the form [l].
Another, the liquid reduction rule (assuming it is a rule, and not a simple failure to
distinguish between l and j, which is unlikely, given the earlier l forms), begins
somewhat later. The two rules compete for the same forms, and various words
come under them at different times. Since the existence of an l is a prerequisite for
the liquid reduction rule, it is not surprising that forms with l's sometimes precede
the more reduced forms. To call the early l forms idioms, and to omit them from
consideration in the phonology, gives a neat picture of successive phonological
systems, but omits important aspects of phonological development.
Just as H's earlier forms with l are relevant to her phonological development,
so are her early forms with p (such as papa). Similarly, even if T's and K's
phonologies show great reductions of surface forms with the later acquisition of
rules, one can maintain that their earlier and more phonetically accurate forms
are part and parcel of the children's phonological development.
Table 4.2. Development of the lateral /l/ in H's speech during the second year
Age    hello   alle   bottle   lie   Loch      Loscher
1;5    l       -      -        -     -         -
1;6    -       -      ba:i     -     -         -
1;7    -       al     ba:i     -     -         -
1;8    -       aj     baiu     -     -         -
1;9    -       -      balu     -     -         -
1;10   jojo    al     baju     -     lok/jok   -
1;11   jojo    -      balu     jai   -         loko/joke
2.4. Saliency rules and avoidance rules
There is another feature of the T and K data which would not be predicted by a
Jakobsonian view of phonological development. A glance at the phone-class
listings shows a strong tendency for each phone class to represent words
containing that sound in the adult model. Thus T simply does not attempt an
adult p word until VI, at which point she has four words beginning with [pʰ].
Similarly, adult alveolars generally appear as alveolar; and not many adult velar
words are attempted at all until velar consonants are being attempted. (H is a
slight exception to this rule, having several k words in her |t| class.) It also
seems to be true that, after a rule such as initial-consonant voicing becomes
active in the child's system (producing, e.g., a general class of oral
labial consonants), the child next takes both adult p and b words into that
class. Thus H acquires almost exclusively b words until 1;10 and 1;11, when a
phonemic voicing contrast begins to develop in her system; however, the few p
words that are acquired tend to have b forms (paper, Paul, pick, pocketbook,
push).
The great selectivity of the child in picking the words which he will attempt to
say is not usually noted in the literature (but see now Ferguson, Peizer, and
Weeks 1973; Ingram 1972). Authors have mentioned a general avoidance of
difficult sounds, multisyllabic words, or words with consonant clusters; but no
one has made it clear that, at an early stage in which a contrast is absent (e.g.,
only b sounds, no p sounds in a child's speech), the adult words chosen by the
child will be highly discriminatory (e.g., he will choose only b words to say).
The issue of phonologically determined selectivity in word acquisition and use,
even by adults, is interesting in regard to the notions of phonological structure
and phonological importance. There are probably different degrees of effort
with which an adult acquires new vocabulary items of different phonetic
shape; and adults may systematically, even consciously, avoid words difficult
to pronounce. In the child's acquisition process, however, this whole issue
seems centrally important and deserves systematic investigation.
Our data, then, seem to cast doubt on the Jakobsonian assumptions of (a)
strict separation between phonetic and phonological development, and of (b)
simultaneity in lexical and phonological parameters of the break between
prelanguage and language. The Jakobsonian position is that, at the very time
at which one finds the first true words, one finds a very reduced phonological
system, and that successive splittings of those vowels and consonants eventually
produce the adult phonological system. In terms of contrasts determined
by phonemic analysis, this account may be true. But in terms of the phonetic
shapes of words and the selective acquisition of words, we have seen that a
child's early words are often much more phonetically accurate than one would
expect, and that these progressive forms reveal processes of sound develop-
ment which remain hidden if a strict separation of phonetic and phonemic
development is assumed.
3. Discussion
The data and analysis provided in this chapter have many implications for
broader issues in linguistics, some of which have already been mentioned. In
this section, such implications will be discussed under the headings of universal
order of acquisition, individual differences, and phonological theory. The
hypothesis of a universal order of acquisition in phonology, first advanced by
Jakobson, has proved to be stimulating and fruitful, and any attempt at theoret-
ical discussion of phonology acquisition must react to it. Individual differences
in language behavior have traditionally been of little interest to linguists; but if
their work is to have relevance for therapy and education, linguists must learn to
use their analytic tools for description and explanation of such differences.
Finally, it is our conviction that the study of child phonology is a major source of
insight for the development of phonological theory in general. Under all three
headings, as in several earlier paragraphs, our tendency is to phrase the dis-
cussion in terms of criticisms of Jakobson. But we want to make it clear that we
feel his theory is still the most detailed, explicit, and suggestive one available
(see Ferguson and Garnica 1973); for this reason, we use it as a starting point in
exploring the implications of our own work.
3.1. Universal order of acquisition
One of Jakobson's major claims is that there is a uniform order of sound
development which tends to occur in different children learning the same
language and, to the extent that phonological structures are similar, in
children learning different languages. The order is held to be a result of
fundamental implicational laws which are equally reflected in the distribution
of phoneme types among the languages of the world such that, e.g., a rare
sound in the world's languages is acquired later by children learning a language
which has it. To what extent, then, do T, K, and H seem to follow the same path
of development, and how does it accord with Jakobson's order?
Even though the phone trees show lexical contrasts rather than the phonemic
contrasts that Jakobson spoke of, the development of the three children is quite
similar, and follows many of Jakobson's predictions. Many of the differences
that do exist can be explained by the fact that the first fifty words do not
constitute a natural unit of phonological development. In particular, K seems
to be further along in the beginning of the data studied than either T or H; while
H, who takes much longer to develop fifty words, is doing more systematizing
as she goes. All three children have labial and alveolar stops as their first sounds,
with nasals and glides in these positions developing later, and fricatives even
later. All three have labial nasals before others, although dental nasals may be
more widespread in languages of the world; but this is acknowledged and
explained by Jakobson. For all three children, velar consonants begin develop-
ment much later than those of other areas.
The details of order, however, are not in exact agreement among the children.
K has a labial nasal long before a labial glide: her first w, like her first f, occurs
two sessions after the end of our data. T develops her |m| class just before her
|w|, although the |w| forms appear more stable. H, on the other hand, develops a
|w| form first, although her |m| class appears to be more productive at 1;5. It is
simply not true, then, that all contrasts are absolutely ordered with respect to
each other; but it is true in general that the ordering predicted by Jakobson
seems to hold for the three cases. We note, however, that Jakobson has not given
appropriate attention to the acquisition of semivowels and /h/, which are
frequently acquired quite early (as in the cases of T, K, and H), and also often
serve as early substitutes for fricatives and liquids.
At least one striking similarity is not predicted at all by Jakobson. All three
children show a preference for voiced labial and alveolar stops, but voiceless
velars. The children show this preference both in the forms they produce and in
their choice of forms from the model language. Thus even H, who does not have
a velar stop class before 1;5, borrows several velar-stop words from the parent
language, all beginning with k. Although her |d| class is the productive alveolar
class, the k words are taken into the |t| class and constitute its only members,
aside from tick-tock. This tendency is mentioned explicitly by Leopold, confirmed
by T and K, and is also supported by data recently reported by Olmsted
(75).¹² This point, which is probably related to the instability of voiced velars
observed by Joseph Greenberg (personal communication), deserves further
investigation.
3.2. Individual differences
Jakobson's concern with generalizations about order of acquisition leaves no
room for considering the nature of individual differences in phonological
development. Yet any careful comparison of different children learning the
same language shows differences in the individuals' paths of development.
Some of the differences are doubtless to be accounted for in terms of different
input under different conditions, e.g., the accidents of use of different vocabu-
lary items, different attitudes on the part of parents, etc. Some of the differences
seem to rest on different strategies adopted by children in acquiring adult
phonology, whatever the ultimate source of such strategies may be (see
Ferguson et al. 1973). Such individual strategies include preferences for certain
sounds, sound classes, or features ('favorite sounds'); extensive use of reduplication;
special markers for certain classes of words (e.g., final -rs as a sign of
reduced polysyllables, cited in Menn 1971); preferences either for lexical
expansion or phonological differentiation at the expense of the other; and
persistent avoidance of particular problem sounds.
Certainly, there are differences in the way T, K, and H are approaching the
learning of phonology. T, for instance, unlike the other two, has sibilant
fricatives and affricates as favorite sounds. Words beginning or ending with
these sounds (e.g., ice, eyes, shoes, keys, cheese, and juice) are welcomed into
her vocabulary and are used often, with varying forms, so that the phone class
representing words which begin with these sounds is very complex. This group
of words might well represent a schema, in the sense of Waterson (1971); but T
does not seem to show the corresponding kind of clear production patterns of
such words that Waterson finds. For T, one might say that the sounds in this class
are, for a time, more important than the lexical contrast which she seems to be
developing at the end of the period.
H, on the other hand, seems to gain control over certain classes and then to
prefer to add new words to them. Classes which show this preference on H's
phone tree are |b|, |m|, |d|, and ||. K similarly seems to prefer adding words to her
|b ~ p|, |d|, and | ~ h ~ | classes, although the greatest additions to those classes
occur soon after the end of this study and are therefore not shown. T, on the other
hand, seems to prefer building up her velar-stop class more than the other two
children do.
K seems to approach the voicing contrast in her labial consonants in a slightly
different way from the other two, although this may simply be caused by the
small samples collected during many sessions. She apparently follows an
avoidance strategy: rather than establishing a separate |p| class before making
|b| the dominant class, she seems to avoid p words from the very beginning. She
also produces no p sound types until the time when p words are being taken into
the |b| class.
One final difference of note is the fact that H whispers most of her words until
1;4, but consistently gives some words full voice. Da is the only voiced word at
1;0, and it continues to be voiced from then on. At certain points, Leopold
hypothesizes that H actually uses the whisper/full-voice contrast to separate
homophonous forms. Thus, during 1;4, fully voiced [dada] is 'Carolyn', while the
same form whispered is 'thank you'. Similarly, [baba] with full voice is 'Papa',
while [baba] whispered is a reduplicated form of 'ball'.
T and K both occasionally whisper words, giving the impression that they are
not sure of those forms; and K (XII) learned whispering as a speech register
associated with a book about a sleeping baby (on the use of whisper as a register
by young children, see Weeks 1971). But there is nothing in the other two
children to compare with H's consistent use of whispering, and it is hard to
assess the function of whispering in her speech development.
In sum, each of the three children is exhibiting a unique path of development,
with individual strategies and preferences and an idiosyncratic lexicon.
3.3. Phonological theory
Linguists approaching the study of child phonology have naturally tended to use
the theoretical constructs which have an important role in their general phono-
logical theories. Thus European and American structuralists have tended to look
for phonemes and distinctive features in child phonology, while generativists
tend to look for unique lexical representations and phonological rules which
operate on them. Our approach is to try to understand children's phonological
development in itself so as to improve our phonological theory, even if this
requires new theoretical constructs for the latter.
The data and analysis of this study suggest a model of phonological develop-
ment and hence of phonology which is very different from those in vogue
among linguists. The model would de-emphasize the separation of phonetic and
phonemic development, but would maintain in some way the notion of con-
trast, i.e., the distinctive use of sound differences. It would emphasize individ-
ual variation in phonological development, but incorporate the notion of
universal phonetic tendencies which result from the physiology of the
human vocal tract and central nervous system, as constrained by universal
syntactic–semantic processes. It would emphasize the primacy of lexical
items in phonological development, but provide for a complex array of phono-
logical elements and relations including the notion of phonological rule in
the sense of a synchronic sound change determined by classes of sounds, lexical
items, or grammatical boundaries. In an oversimplified characterization, the
model would assert that children learn words from others, construct their own
phonologies, and gradually develop phonological awareness. The elaboration
of such a model is a major undertaking going far beyond the limits of this study,
but four key assumptions are worth stating here.
First, we assume that a phonic core of remembered lexical items and articu-
lations which produce them is the foundation of an individual's phonology, and
remains so throughout his entire linguistic lifetime. Lexical items of particular
phonetic shapes are acquired together with notions of appropriateness of use in
particular social frames; and changes in the phonic core are to be understood and
accounted for in terms of (at least) lexical, phonetic, and social parameters. Thus
we assume the primacy of lexical learning in phonological development, even
though it may be heavily overlaid or even largely replaced by phonologically
organized acquisition processes at later stages. Lexical primacy has many
implications which cannot be developed here, such as the need for assuming a
nonphonological, organized phonetic storage and the need to rethink our notion
of the phonology of a language.
Second, we assume that the child constructs phonological abstractions or
generalizations from his own phonic core and to some extent from new input;
i.e., he gradually imposes increasing phonological organization on his stock of
articulations and lexical representations. The kinds of organization may include
allophonic relationships, processes of assimilation, constraints on the phonetic
structure of morphemes, and all the complex regularities which linguists are
able to identify. We emphasize, however, that our approach requires the vali-
dation of phonological regularities by empirical investigation; i.e., a particular
relationship or process can be imputed to a particular child only when there is
direct or indirect behavioral evidence. For example, the fact that a child has two
lexical items which differ in a single phonetic segment is not in itself sufficient
justification for asserting that the child has the phonological contrast as such in
his repertory. Evidence is required from the child's verbal play, his response to
experimentally introduced nonsense material, or the like.
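The caution about minimal pairs can be made concrete with a small sketch. The Python fragment below lists pairs of lexical items whose forms differ in exactly one segment. The function names and the toy lexicon are assumptions made for this illustration and are not part of the method of the study; the forms are simplified stand-ins, not transcriptions taken from the appendices. What the sketch returns is only a list of candidate contrasts: on the view taken here, each candidate would still need direct or indirect behavioral support before the contrast could be imputed to the child.

```python
from itertools import combinations


def differ_in_one_segment(form_a, form_b):
    """Return True if two equal-length segment sequences differ in exactly one position."""
    if len(form_a) != len(form_b):
        return False
    mismatches = sum(1 for a, b in zip(form_a, form_b) if a != b)
    return mismatches == 1


def candidate_minimal_pairs(lexicon):
    """List pairs of glosses whose forms differ in a single segment.

    `lexicon` maps a gloss to a tuple of segments. The result is a list of
    candidates only; it says nothing about whether the child controls the contrast.
    """
    pairs = []
    for (gloss_a, form_a), (gloss_b, form_b) in combinations(sorted(lexicon.items()), 2):
        if differ_in_one_segment(form_a, form_b):
            pairs.append((gloss_a, gloss_b))
    return pairs


# Hypothetical, simplified forms used purely for illustration.
toy_lexicon = {
    "ball": ("b", "a"),
    "man": ("m", "a"),
    "mama": ("m", "a", "m", "a"),
    "papa": ("b", "a", "b", "a"),
}

print(candidate_minimal_pairs(toy_lexicon))
```

In this toy lexicon only 'ball' and 'man' are flagged; 'mama' and 'papa' differ in two segments and are not. Even the flagged pair shows no more than that two words happen to differ in one segment.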
Third, we assume (although this feature of the model does not follow from
the limited data of this study) that phonological development includes the
gradual development of phonological awareness; i.e., the child's ability to deal
explicitly with phonological elements and relations is seen as a kind of
self-discovery of his phonological organization (see Ferguson and Slobin
1973:138ff.; Kavanagh and Mattingly 1972:138–41, 321–2, 327–9).
Fourth, we assume that an adult's ability to pronounce his language at any
point in his life constitutes a stage in his phonological development, and that this
ability exhibits the same kind of structure (although obviously differing in
detail) that is assumed in the child's phonological development. Thus any
satisfactory analysis of an adult's pronunciation of his language requires the
specification of relevant lexical classes and the identification of relevant social
dimensions in addition to the description of phonetic elements and relations.
Further, since children have different inputs and utilize different strategies, the
gradual development of phonological organization and phonological awareness
may proceed by different routes and at different paces; hence adult phonologies
may differ from one another just as the lexical stocks of individuals may differ.
The individual's phonological idioms at any age are not mysterious aberra-
tions, but are manifestations of the natural course of phonological development.
In order to gain a deeper understanding of phonological development and
hence of phonology in general, some linguists at the present stage of the art
might be well advised to turn away from the fascination of writing rules of
maximum generality and conciseness for whole languages, and undertake
instead highly detailed analysis of the idiosyncratic paths which particular
children follow in learning to pronounce their languages.
Notes
1. The difficulty and the challenge are neatly summarized by Chomsky (1964: 35–6): 'It
seems that the attempt to write a grammar for a child raises all of the unsolved
problems of constructing a grammar for adult speech, multiplied by some rather large
factor . . . if anything far-reaching and real is to be discovered about the actual
grammar of the child, then rather devious kinds of observations of his performance,
his abilities, and his comprehension in many different kinds of circumstances will
have to be obtained, so that a variety of evidence may be brought to bear on the
attempt to determine what is in fact his underlying linguistic competence at each stage
of development.'
2. The data collection and some of the analysis were carried out under the Stanford
University Child Phonology Project, which is supported by National Science
Foundation Grants GS 2320 and GS 30962. The data collection was planned by
Carol Molony, and was carried out by her, Carol Farwell, and Carolyn Johnson.
Transcriptions used in this study were done chiefly by Farwell but some also by
Molony, and some of the questions were discussed in the Child Phonetics Workshop
conducted by Clara N. Bush.
3. In the case of K, 72 words were included in order to get 50 words which occurred
either spontaneously or more than once in an articulated form.
4. Leopold's comments about H are of special value because they are made in
comparison with his observations of her younger sister Karla, whose speech devel-
opment he followed in less detail some years later. He also makes comparisons with
previously published studies of child language development.
5. At a slightly later period, H had [] as a favorite syllable-final sound, where it
represented any fricative of the adult model and was used very frequently.
6. For T and K, the roman numerals indicate session numbers.
7. In this case the pronunciation may be caused not by syllable deletion but by adult
renditions with suppressed initial syllable: pronunciations like [kju] and [mke] are
fairly common among adults for 'thank you' and 'okay' respectively, although they
were not observed from the adult in this study.
8. One example of prosodic treatment of a word was so radical that it was not included,
but it is interesting in itself. In K IV, the new word 'pen' received the following forms
in this order in a one-half hour session: (1) [m] (imitation), (2) [] (imitation),
(3) [d dn], (4) [hin], (5) [ᵐb], (6) [pʰin], (7) [tʰn tʰn tʰn], (8) [baʰ], (9) [dʰauᴺ],
(10) [bu]. K seems here to be trying to sort out the features of nasality, bilabial closure,
alveolar closure, and voicelessness.
9. The box was adopted as a convenient symbol, different from the brackets and slant
lines used in phonological transcriptions, and suggestive of the unity of the class; the
same symbol was used with a somewhat similar value in Jakobson 1949. For
typographic reasons, the box is replaced by vertical lines in the present text, although
boxes are used in the figures.
10. There is a danger that phone classes containing the same words may not actually
correspond because of an intervening reanalysis of a certain word at the input level
by the child. There is evidence that such reanalysis does take place: see Smith's
example of some and its compounds (1973:145–6). Probably, however, such reanalysis
is relatively infrequent, and in any case not directly related to the development
of the sound system.
11. Two important recent exceptions should be mentioned: Chen (1972), for a fuller
discussion of different approaches to sound change and an explication of Wang's
model as applied to the lexical parameter; and Labov (1972), for a typology of sound
change along social and lexical parameters. In the absence of any linguistically
motivated ordering principle, we assume that phonological change affects earliest
child language according to Labov's Model E, 'Random decomposition'.
12. Editors' note: the full reference is missing in the original paper.
References
Berko Gleason, J. (1973). Code switching in children's language. In T. E. Moore (ed.),
Cognitive development and the acquisition of language, pp. 159–67. New York:
Academic Press.
Bloomfield, L. (1933). Language. New York: Holt.
Chen, M. (1972). The time dimension: contribution toward a theory of sound change.
Foundations of Language, 8, 457–98.
Chomsky, N. (1964). Formal discussion. In U. Bellugi and R. Brown (eds.), The
acquisition of language (Monographs of the Society for Research in Child
Development 29:1), 35–9. Lafayette, IN: Purdue University.
Cruttenden, A. (1970). A phonetic study of babbling. British Journal of Disorders of
Communication, 5, 110–17.
Edwards, M. L. and Garnica, O. K. (1973). Patterns of variation in the repetition of
utterances by young children. MS.
Ferguson, C. A. (1971). 'Short a' in Philadelphia English. Stanford Occasional Papers
in Linguistics, 1, 2–27. (Also in M. Estelle Smith (ed.), Studies in linguistics in
honor of George L. Trager, pp. 259–74. The Hague: Mouton, 1973.)
Ferguson, C. A. and Garnica, O. K. (1973). Theories of phonological development. In
E. H. and E. Lenneberg (eds.), Foundations of language development, pp. 153–80.
New York: Academic Press.
Ferguson, C. A., Peizer, D. B., and Weeks, T. E. (1973). Model-and-replica phonological
grammar of a child's first words. Lingua, 31, 35–65.
Ferguson, C. A. and Slobin, D. I. (eds.) (1973). Studies of child language development.
New York: Holt, Rinehart & Winston.
Francescato, G. (1968). On the role of the word in first language acquisition. Lingua, 21,
144–53.
Hsieh, H-I. (1972). Lexical diffusion: evidence from child language acquisition. Glossa,
6, 89–104.
Ingram, D. (1972). Phonological analysis of a developmentally aphasic child. Mimeo,
Institute for Childhood Aphasia, Stanford University.
(1973). Phonological rules in young children. Journal of Child Language, 1, 49–64.
Jakobson, R. (1949). Principes de phonologie historique. Appendix to Principes de
phonologie, by N. S. Troubetzkoy. Paris: Klincksieck. (Originally presented at the
Réunion Phonologique Internationale, Prague, 1930 and published in German,
Travaux du cercle linguistique de Prague, 4 (1931). Also in R. Jakobson,
Selected writings I, 202–20, 1962.)
(1941/1968). Child language, aphasia and phonological universals, trans. A. R. Keiler.
The Hague: Mouton. (Originally published as Kindersprache, Aphasie und
allgemeine Lautgesetze. Uppsala: Almqvist & Wiksell, 1941.)
Johnson, C. E. and Bush, C. N. (1972). A note on transcribing the speech of young
children. Papers and reports on child language development, Stanford University,
3, 95–100.
Kavanagh, J. F. and Mattingly, I. G. (eds.) (1972). Language by ear and eye. Cambridge,
MA: MIT Press.
Labov, W. (1972). The internal evolution of linguistic rules. In R. P. Stockwell and
R. K. S. Macaulay (eds.), Linguistic change and generative theory, pp. 101–71.
Bloomington: Indiana University Press.
Leopold, W. F. (1939–49). Speech development of a bilingual child, 4 vols. Evanston, IL:
Northwestern University Press.
McCurry, W. H. and Irwin, O. C. (1953). A study of word approximations in the
spontaneous speech of infants. Journal of Speech and Hearing Disorders, 18, 133–9.
Malkiel, Y. (1967). Each word has a history of its own. Glossa, 1, 137–49.
Menn, L. (1971). Phonotactic rules in beginning speech. Lingua, 26, 225–51.
Moskowitz, A. I. (1971). Acquisition of phonology. Dissertation, University of
California, Berkeley.
(1972). Idiomatic phonology and phonological change. Mimeo, University of
California, Los Angeles.
Olmsted, D. L. (1971). Out of the mouth of babes. The Hague: Mouton.
Smith, N. V. (1973). The acquisition of phonology. Cambridge University Press.
Templin, M. C. (1947). Spontaneous versus imitated verbalization in testing articulation
in pre-school children. Journal of Speech Disorders, 12, 293–300.
Wang, W. S-Y. (1969). Competing changes as a cause of residue. Language, 45, 9–25.
Weeks, T. (1971). Speech registers in young children. Child Development, 42, 1119–31.
Appendix 1: Primary data from T, K, and H
Data from T
Session I: 34 utterances, 4 words, 5 imitations
daddy (7) d
j, d
(2), d d
(2), dt, d
1
dog (5) d (5)
hi (20) h (2), ha (12),
h
a (3), a (3)
see (2) h
i
,
i
Session II: 30 utterances, 5 words, 8 imitations
baby (2) be
(imit.), dd
i
(imit.)
dog (4) d (2), d (2)
hi (14) a (5), ha (4), h (2), a, ha
i
, h
thank you (6) k
h
ju, k
x
ju
t
, dkk (imit.), td (imit.),
dd (imit.), dnk (imit.)
asking word (4) hlji, ljije, lij, le
Session III: 27 utterances, 9 words, 7 imitations
baby (1) de
i
bye-bye (1) bb (imit.)
daddy (1)
(3)
dog (2)
ddi
d
imit:; d
imit:
g
dd; d
; d
d
hi (4) ha (3), a
mama (8) mm (5), mmb, mnm, mm
shoe (2) gut, gutdi
thank you (3) djuk
ha
, dt
h
, d
d
tea (2) t
h
i (2)
Session IV: 26 utterances, 10 words, 6 imitations
baby (4) beb (2), bebe, p/be
ball (2) p
h
(imit), b
bye-bye (1) d b
daddy (2) dt
h
i, dd
duck (1) gk
h
(imit.)
hi (4) ha (2), h, a (imit.)
see (2) :, i:
shoe(s) (5)
, t
h
u,
i
, ,
thank you (3) dt
h
, t
h
dju, n da-d
where (2) w,
Session V: 30 utterances, 12 words, 3 imitations
baby (2) be, beb
w
i
ball (3) be,
m
bi, b
h
cracker (1) t
h
di (imit.)
daddy (1) ddi
eye (3)
(1)
a (2), a
)
ha
hi (2) a, ha
rock (5) wkwk (2), wkwe, wx, bk
see (4)
t
si, i, si, s
i
shoes (3) i, d,
u
sit (1)
t
st
tea (2) t
h
i
thank you (2) d
h
di,
n
xju (imit.)
Session VI: 56 utterances, 25 words, 7 imitations
allgone (4) ag
h
o, awo, ok
h
, ok
h
u
baby (3) ei, bi, bii
ball (1) ba
blanket (1) bijbj (imit.)

book (2) q, b
bounce (3) b, b, bw (series)
bye-bye (2) b
I
b (imit.), p
h
di
cereal (1) u (imit.)
dog (4)
n
d (2), da, da
h
a
cheese (1) i
hi (1) ha
ice (3) a (2), a
night-night (1) n n (imit.)
no (3) nn, no, n
h
(imit.)
paper (2) et, b.du
pat (4) p
h
t (3), p
h
please (2) p
h
e (2)
pretty (1) pr hi
purse (2) p
h
e, p
h
rock (6) wkuk (2), ukwk (2), ukuk, wk

shoe (4)
I
u,
I
u, u, t
h
u
h
tea (2)
h
i, t
h
i
thank you (1) t
h
h
i
up (1) a
yeah (1) ij
h
Session VII: 132 utterances, 29 words, 25 imitations
allgone (4)
X
g, g (2) (imit.), axg (imit.)
baby (9) bebi (5), p
h
ebi (imit.), wepi (imit.), be, bibi
ball (1) b (imit.)
bang (1) b
blanket (2) ba, b
book (4) bk (2), bk, bx
bounce (6) b (3) (series), b b b
w
(series)
box (4)

b (2), b
h
,bk (all imit.)
bye-bye (4) bab (2), bp, w (all imit.)
chair (2)
h
(2)
cheese (6) , , sti, h, ti,
t
i
daddy (10) ddi (3), ddi (2), di, d
h
i (imit.), t
h
(imit), t
h
i (imit.), t
h
dog (1) d (imit.)

eyes (2) a (2) (imit.)
ice (2) a, a (imit.)
key (s) (21) k
h
i (8), k
w
i (2), gi (2), k
h
e (2), k
h
i (2), xis, k
x
i, kxi, k
x
i, k
h
it
h
i
milk (9) mk (1),
m
bk (2)
no (2) n
.
o
paper (1) p
h
e
pat (1) p
h
purse (1) pti (imit.)

)
rocking chair
rock (5) wkwk, ukuk, uxk, ukd, wkntu
shoe (6) u, t
h
u, juj, uj, u
i
, tuc
I
sit (5)
t
(3), (2)
tea (9) i (3), t
h
i
(3), tji, di, ti
thank you (1) t
h
t
h
tiger (3) t
h
t
h
u, tc , t
h
k
h
i
up (1) a
h
where (9) u (2), u
(2), w
(5)
Session VIII: 106 utterances, 30 words, 9 imitations
allgone (2) ak
h
ox, at
h
o (imit.)
baby (2) be bi (2)
ball (2) (2)
book (2) b
x
(imit.), b
bye-bye (1) bp
h
(imit.)
cat (2) k
h
h
, t
h
cheese (5) (2), :, d,

daddy (9) ddi (5), t
h
t
h
i (2), d d, t
h
t
h
dog (4) d
, do, do, do
eye (s) (3) a, ai, a

t
: (imit.)
flower (1) f::
ice (3) a (2), a
juice (7) du (2), u
, , dju, tut
, ti
key (3) k
h
i (3)
mama (6) mm (2), mam (2),
, mm
milk (6) mx (2), bx (2), m
e
!
x, bx
no (8) n (7),

n
out (8) ax (3), a
(2), ax, a
x, x
paper (1) p
h
ep
h
e (whisper, imit.)
pat (1) p
h
b (pat bunny)
purse (1) p
h
y (imit.)
rock (6)
u
wkwk (2), wk,
u
wx,
hu
wx,
u
wk
h
see (2) ci (2)
shoe(s) (2)
i
u,

u::
sit (5) (2), , , t
tea (4) t
h
i (3), t
h
i
thank you (4) t
h
t
h
, dt
h
a, dt
h
, t
h
ht
h
tiger (2) tki (2)

two (2) t
h
(2)
walk (1) uk (imit.)
Session IX: 76 utterances, 32 words, 13 imitations
allgone (4) k
x
o, avg
1
o
x
, lgo, lk
h
o
baby (3) bep
h
i (2), bebi
ball (1) ba (imit.)
blanket (3) bh
book (4) bvk, bk
h
u, bx, bk
h
box (2) bx
bye-bye (2) bb (2) (imit.)
Carol (1) k
h
o
cereal (2) siu, uwu
cheese (1) ti
cup (1) t
h
(imit.)
daddy (2) ddi, d
dog (1) g
eye(s) (1) a
t
s i (imit.)
feet (7) fe, p
h
ets,
t
, f, p
h
i, p
h
i
, t
hi (1) ha (imit.)
juice (5) dus , duc
o
u
(2), djuc (2)
milk (5) b, k
no (3) no (2), na
okay (1) k
h
e (imit.)
one (1) w
paper (3) p
h
e p
h
i (imit.)
pot-pot (1) p
h
p
h
a
pretty (3) p
h
i
, p
h
i, p
h
t
h
i (imit.)
purse (2) py, p
h
rock (3) b
h
b
h
, b, b (imit.)
shoe (5) d u, sju
(imit.), i
u
,
u (imit.)
u
sit (3) (2), .
i
thank you (1) t

h
axi
tiger (1) dak
x
i
two (2) t
h
, t
h
u
where (4) w
(3), w
Data from K
Session I: 33 utterances, 12 words, 6 imitations
allgone (3) k
h
o,
g , ga (imit.)
bear (1) bi
u
j (imit.)
book (1) b
w
ux
daddy (2) de
hI
,

dd (imit)
duck (1) d (imit.)
hi (1) ha
mine (1)
b
ma
monkey (1) m
k
bu (imit.)
pop (1) p
h
ap
h
see (9) di (4), di, i, z

1
, si, t
h
i
thank you (2) ddi, ddud
(imit.)
that/there (10) d (2), d (2), id
, d, d
, d, d
, t
Session II: 6 utterances, 3 words, 1 imitation
block (1) bok
h
dog (2) da, d (imit.)

hello (3) h
o
, h, dugo
Session III: 14 utterances, 7 words, 3 imitations
allgone (4) p
h
(2), gu, ap
daddy (2) d: du, ddiu
down (1) d
a
duck (2) d, gp
(imit.)
on (1)
(imit.)
off/up (2)
,
b
u
shoe (2) g (2)
Session IV: 27 utterances, 8 words, 3 imitations
boom (1) bu
daddy (3) ddi
(3)
dog (4) dad
1
, d, d
w
, abd
Kimberly (1) s imu
me/mime (2) m
N
, m
no (2) , n.o
h
pen (12) m
(imit.), ~(imit.), d
dn
, hn,
m
b, p
h
n,
t
h
n (), b
h
, d
h
a
n
, bu
thank you (2) h
(imit.),
m
kj
Session V: 19 utterances, 11 words, 11 imitations
car (1) gk
h
dog (2) d (imit.), do
(imit.)
duck (2) dk
h
, d
gum (1) g
hot (4) ht
h
, h
h
, ht
h
, h (all imits.)
hot tea (1) hdi (imit.)
Max (1) m.~ (imit.)
Nona (3) mm
h
(imit.), nun
, mun
h
Satchiko (1) un (imit.)
there (2) d, d
turkey (1) t
hI
g (imit.)
Session VI: 6 utterances, 4 words, 3 imitations
balloon (2) bp
(imit.), b
b
box (1) b
t
(imit.)
hot (1)
h
d (imit.)
that (2)

d,
t
h
Session VII: 11 utterances, 6 words, 9 imitations
boy (1) b (imit.)
cow (2) g
h
(imit.), gj
lady (2) lad, ld (imit.)
moo (2) b (imit.), b
(imit.)
telephone (3) t
h
d, t
h
, tddd (all imit.)
thank you (1) x
h
(imit.)
Session VIII: 5 utterances, 4 words, 3 imitations
allgone (2) ul
, a
(both imits.)
night-night (1) m
(imit.)
off (1)
watch (1) bt:
h
Session IX: 15 utterances, 12 words, 7 imitations
allgone (1) aw
kh
bear (1) b
t
(imit.)
bee (1) bi:
bird (2) b
n
d, p
u
x (both imits.)
coat (1) k
h
k
hI
cookie (1) k
h
xju
M (2) , k
h
(both imits.)
girl (1) k
h
(imit.)
no (1) n
pumpkin (1) bg
n
(imit.)
tea (1) t
h
N
turn (off) (2) t
h
r, t
h
na
Session X: 15 utterances, 13 words, 3 imitations
all done (1)
.
ba
baby (1) p
h
e (imit.)
boot (1) b
h
bottle (1) badf
It's a toothbrush (1) is i (imit.)
Max (1) m
k
h
(imit.)
me (1) m
moo (1) mu:
on (2) al
, a:dit
h
put (1) p
h
u
see (1) i
that (1) d t
h
up/off (2) ,
Analyzed sentence
me see that m i et
h
Session XI: 11 utterances, 10 words, 7 imitations
balloon (2) bo
u
wu, bow
book (1) bk
h
bow (1) pvo
(imit.)
dog (1) gak
h
i (imit.)
driver (1) dd (imit.)
girl (1) g
x (imit.)
kitten (1) k
v
h
i
d
n
pencil (1) p
hI
s
n
(imit.)
recorder (1) fod
(imit.)
sock (1) gk
h
(imit.)
Session XII: 26 utterances, 19 words, 7 imitations
baby (1) p
h
eb
h
i
break (1) p
h
k
h
(imit.)
bye-bye (1) b:
t
ears (1) irw
x
eye (2) ai
h
, ha
i
(both imits.)
go (1) go
it's a (1) its
key (1) k
h
e
h
mama (1) b
w
h
moo (1)

mu.
h
(imit.)
mouth (1) m
f
no (4) n
u
, no (2), n
nose (1) n
(imit.)
puppy (1)

bbi
i
shoe (2) i
u
(imit.), hu
l
that a (2)

d
h
h
(imit.), di
there (1)
h
il
tie (1) t
h
ai
l
woofwoof (2) f,
Analyzed sentences
that a woofwoof di
f
It's a baby itsa p
h
eb
h
i
Data from H
0;10 4 utterances, 2 words
pretty (1) prti (wh/vd)
there (3) di, dii, de:
pretty (1) priti (wh/vd)
there (2) d:
ticktock (1) tak (wh)
ball (1) ba (wh)
Blumen (1) bu (wh)
da (1) da:
Opa (1) pa (wh)
Papa (1) pa-pa (wh)
piep! (1) pi, pipi
pretty (1) pti (wh)
sch-sch (1) -
ticktock (2) ti-ta, ta, t-t (wh)
ball (1) ba (wh) pieks! (1) by
bimbam (1) bi: piep! (3) pi:, pi:p, pi pi
da (1) da pretty (2) priti, prti
Gertrude (2) d:da, d:di sch-sch (1) -
kiek! (1) ti Tante (2) da-da, di-d
kritze (1) tits (wh) ticktock (1) t-ta
Opa (1) pa-o (wh) Wauwau (1) wa wa (falsetto)
Papa (2) pa-pa, ba-ba (wh)
baby (1) bebi Papa (2) papa, baba
ball (2) b, p (wh) piep (1) pi pi
bimbam bi-ba pretty (0) (rare)
Carolyn (1) g-ga sch-sch (1) -
da (1) da ticktock t-ta
kritze (1) tits (wh) Wauwau (1) wa wa (falsetto)
moo (1) mu:
A-a (1) a-a kitty (2) di di, ti ti (affricated)
baby (1) be bi: kritze (1) tits
ball (2) ba: i, ba (wh) Mama (3) mama, maba, ma
bath (1) ba: Papa (1) baba
Bild (1) bi piep (1) bi bi
bye-bye (1) ba ba (wh) pretty (1) pti
Carolyn (1) dada sch-sch -
da (1) da thank you (1) da da (wh)
ja (1) ia ticktock t
t
i-t
t
a
kiek! (1) ti: Wauwau wu wu wu
A-a (1)
t
a
t
a kritze (1) tits
baby (1) be bi Mama (3) bama, maba, mama
ball (2) ba: i, ba:i (wh) Marion (1) mm
bath (1) ba: Papa (1) ba ba
bed (1) b peekaboo (1) bi
Bild (1) bi: piep! (1) pi pi (falsetto)
bye-bye (1) babai (wh/vd) pretty (3) pwiti, pti, pyiti (wh)
Carolyn (1) dada sch-sch -
da (1) da thank you (1) dada, dadai (wh)
da ist es (2) da:i, da:i ticktock (2) ti-ta, tik-tak (wh/vd)
down (1) da: up (1) ap
hot (1) ha (wh) Wauwau (1) wu wu
ja (1) ja (wh) yes (1) j
A-a (1) a a I see you (1) ai i
all (1) a: ja (1) ja ja ja ja
apple (2) apa, aba klingelingling (1) li li li
auto (3) ata, ada, aoda mama (1) mama
baby (1) bebi man (2) m, ma
ball (1) ba Marion (1) meme
bath (1) ba: mehr (1) me:
bed (1) b mitten (1) mi:
bitte (2) bit, biti naughty (1) nana (-like)
bye-bye (1) bai bai Papa (1) baba
brush (1) b night (1) a a
Carolyn (1) dada piep! (1) pi pi
da (1) da pretty (0) (rare)
da ist es (1) da: i: Rita (1) wi wi
down (2) da:, da: o sch-sch -
heiss (1) hai (wh) thank you (2) da dai, dada
hello (1) l there (1) d
highchair (1) aita ticktock (1) ti ta
hot (1) ha up (1) ap
I (1) ai Wauwau (1) wu wu
Appendix 2: Phone classes for T, K, and H
Phone classes for T
Session I
h ~ hi, see
d ~ d daddy, dog
Session II
h ~ ~ hi
d ~ d ~ t daddy, dog, thank you
b ~ d baby
kʰ ~ kˣ thank you
l ~ lj ~ h asking word
Session III
h ~ hi
d ~ dj daddy, dog, thank you, baby
m ~ mn mama
g shoe
tʰ tea
b bye bye
Session IV
b ~ pʰ ~ p baby, ball
d ~ tʰ thank you, daddy, bye-bye
h ~ hi
w where
~ ~ tʰ shoe, see
g duck
Session V
b ball, baby
ᵗs ~ s ~ see, sit
~ ~ d shoe
w ~ b rock
tʰ tea, cracker
d daddy, thank you
h ~ ~ hi, eye
Session VI
w rockrock
b ~ ~ pʰ ~ bw ~ ~ baby, bounce, bye-bye, paper, blanket, ball, book
pʰ please, purse, pretty, pat
d dog
tʰ tea, thank you
j yeah
h hi
n no, night-night
~ ~ ~ tʰ cereal, cheese, shoe
~ up, allgone, ice
Session VII
w rock, where
b ~ p
h
~ w ~ ~ bu baby, ball, box, bye-bye, book, bang,
blanket, bounce
p
h
~ p pursey, pat, paper
k
h
~ k ~ k
w
~ kx ~ x ~ g key
d ~ t
h
~ d dog, daddy
t
h
~ t ~ d tea, thank you, tiger, chair
~ t ~ ~ j ~ t
h
~ s ~ h ~ ~ t ~ dj shoe, sit, cheese
m ~ b milk
n no
~ eyes, ice, allgone, up
Session VIII
t
h
~ d ~ d thank you, daddy
t
h
~ t two, tea, tiger
d dog
k
h
~ t
h
key, cat
w rock, walk
b ~ ball, book, bye-bye, baby
p
h
pat, purse, paper
n no
m ~ b mama, milk
~ see, shoes, sit
~ dj ~ d ~ t ~ d ~ cheese, juice
f flower
~ eyes, ice, out
Session IX
p
h
~ p paper, purse, pot-pot, pretty
f ~ p
h
feet
b
x
bo, bye-bye, baby, book, rock, ball, blanket
k
h
Carol, okay
g dog
h hi
n no
b ~ milk
t
h
thank you, two, cup
d ~ d tiger, daddy
w where, one
t cheese
d ~ dj juice
~ allgone, eyes
~ ~ ~ dj s shoe, cereal, sit
Phone classes for K
Session I
p
h
pop
b book, bear
m ~
b
m monkey, my
d ~ d ~ t thank you, duck, daddy, pointing word
d ~ t
h
~ s ~ z ~ see
h hi
all gone
Session II
b block
d dog
h ~ d hello
Session III
d ~ (g) ~

d duck, down, daddy
g shoe
~ on, up/off, allgone
Session IV
b boom
m me/my
d ~ dog, daddy
n ~ no
s Kimberly
k thank you
Session V
m ~ n Nona, Max
d dog, duck, there
t
h
turkey
Satchiko
g car, gum
h hot, hot tea
Session VI
b balloon, box
d ~ that
h hot
Session VII
b boy, moo
t
h
~ t telephone
l lady
g cow
thank you
Session VIII
b watch
m night-night
off, allgone
Session IX
b ~ p bird, bear, bee, pumpkin
t
h
tea, turn off
n no
~ k fish
k
h
cookie, coat, girl
allgone
Session X
b ~ p put, baby, boot, bottle
m moo, Max, me
d that
see
~ all done, on, up, off
Session XI
b book, balloon
p ~ p
h
bow, pencil
f recorder
d driver
g dog, girl, sock
k
h
kitten
Session XII
b ~ p baby, puppy, bye-bye, break, mama
m moo, mouth
d ~ d there, that
t
h
tie
n no, nose
h ~ shoe
g go
k
h
key
h ~ ~ ears, eyes, itsa
f ~ woof-woof
Phone classes for H
0;10 p pretty
d there
0;11 p pretty
t ticktock
d there
1;0 p Opa, Papa, piep, pretty
b ball, Blumen
t ticktock
d da
sch-sch
1;1 p Opa, piep, pretty
p ~ b Papa
b ball, bimbam, pieks!
w wauwau
t ticktock, kiek!, kritze
d da, Gertrude, Tante
sch-sch
1;2 p piep, pretty
p ~ b ball, Papa
b baby, bimbam
m moo
w wauwau
t ticktock, kritze
d da
sch-sch
g Carolyn
1;3 p pretty
b baby, ball, bath, Bild, bye-bye, Papa, piep
m mama
w wauwau
t ticktock, kritze, kiek
t ~ d kitty
d Carolyn, da, thank you
j ja
sch-sch
A-a
1;4 p pretty, piep
b baby, ball, bath, bed, Bild, bye-bye, Papa, peekaboo
m ~ b mama, Marion
w wauwau
t kritze, ticktock
d Carolyn, da, da ist es, down, thank you
j yes, ja
sch-sch
h hot
A-a, up
1;5 p piep, pretty
b baby, ball, bath, bed, bitte, brush, bye-bye, papa
m mam, man, Marion, mehr, mitten
w wau wau, Rita
t ticktock
d Carolyn, da, da ist es, down, thank you, there
n ~ naughty, night-night
l klingelingeling
j ja
sch-sch
highchair
h hot, heiss
A-a, all, apple, auto, hello, I, I see you, up
Appendix 3: Word Index for T, K, and H
A-a H-1;3, 1;4, 1;5
all H-1;5
all gone T-VI, VII, VIII, IX; K-I, III, VIII, IX, X
apple H-1;5
auto H-1;5
baby T-II, III, IV, V, VI, VII, VIII, IX; K-X, XII; H-1;2, 1;3, 1;4, 1;5
ball T-IV, V, VI, VII, VIII, IX; H-1;0, 1;1, 1;2, 1;3, 1;4, 1;5
balloon K-VI, XI
bang T-VII
bath H-1;3, 1;4, 1;5
bear K-I, IX
bed H-1;4, 1;5
bee K-IX
Bild H-1;3, 1;4
bimbam H-1;1, 1;2
bird K-IX
bitte H-1;5
blanket T-VI, VII, IX
block K-II
Blumen H-1;0
book T-VI, VII, VIII, IX; K-I, XI
boom K-IV
boot K-X
bottle K-X
bounce T-VI, VII
bow K-XI
box T-VII, IX; K-VI
boy K-VII
break K-XII
brush H-1;5
bye-bye T-IV, VI, VII, VIII, IX; K-XII; H-1;3, 1;4, 1;5
car K-V
Carol(yn) T-IX; H-1;2, 1;3, 1;4, 1;5
cat T-VIII
cereal T-VI, IX
chair T-VII
cheese T-VI, VII, VIII, IX
cookie K-IX
coat K-IX
cow K-VII
cracker T-V
cup T-IX
da H-1;0, 1;1, 1;2, 1;3, 1;4, 1;5
da ist es H-1;4, 1;5
daddy T-I, III, IV, V, VII, VIII, IX; K-I, III, IV
dog T-I, II, III, VI, VII, VIII, IX; K-II, IV, V, XI
down K-III; H-1;4, 1;5
driver K-XI
duck T-IV; K-I, III, V
ears K-XII
eyes T-V, VII, VIII, IX; K-XII
feet T-IX
fish K-IX
flower T-VIII
Gertrude H-1;1
girl K-IX, XI
go K-XII
gum K-V
heiss H-1;5
hello K-II; H-1;5
hi T-I, II, III, IV, V, VI, IX; K-I
high chair H-1;5
hot K-V, VI; H-1;4, 1;5
hot tea K-V
I H-1;5
ice T-VI, VII, VIII
I see you H-1;5
ja H-1;3, 1;4, 1;5
juice T-VIII, IX
key T-VII, VIII; K-XII
kiek! H-1;1, 1;3
Kimberley K-IV
kitten, kitty K-XI; H-1;3
klingelingeling H-1;5
kritze H-1;1, 1;2, 1;3, 1;4
lady K-VII
mama T-III, VIII; K-XII; H-1;3, 1;4, 1;5
Mann H-1;5
Marion H-1;4, 1;5
Max K-V, X
mehr H-1;5
milk T-VII, VIII, IX
mine (me) K-I, IV
mitten H-1;5
monkey K-I
moo K-VII, X, XII; H-1;2
mouth K-XII
naughty H-1;5
night-night T-VI; K-VIII; H-1;5
no T-VI, VII, VIII, IX; K-IV, IX, XII
Nona K-V
nose K-XII
off K-VIII
okay T-IX
on K-III, X
one T-IX
Opa H-1;0, 1;1
out T-VIII
Papa H-1;0, 1;1, 1;2, 1;3, 1;4, 1;5
paper T-VI, VII, VIII, IX
pat T-VI, VII, VIII
peekaboo H-1;4
pen K-IV
pencil K-XI
pieks! H-1;1
piep H-1;0, 1;1, 1;2, 1;3, 1;4, 1;5
please T-VI
pop K-I
pot-pot T-IX
pretty T-VI, IX; H-0;10, 0;11, 1;0, 1;1, 1;2, 1;3, 1;4, 1;5
pumpkin K-IX
puppy K-XII
purse T-VI, VII, VIII, IX
put K-X
recorder K-XI
Rita H-1;5
rock (rock) T-V, VI, VII, VIII, IX
Satchiko K-V
see T-I, IV, V, VIII; K-I
sch-sch H-1;0, 1;1, 1;2, 1;3, 1;4, 1;5
shoe T-III, IV, V, VI, VII, VIII, IX; K-III, XII
sit T-V, VII, VIII, IX
sock K-XI
Tante H-1;1
tea T-III, V, VI, VII, VIII; K-IX
telephone K-VII
thank you (danke) T-II, III, IV, V, VI, VII, VIII, IX; K-I, IV,
VII; H-1;3, 1;4, 1;5
that K-V, VI, XII
there K-V, VI, XII; H-0;10, 0;11, 1;5
ticktock H-0;11, 1;0, 1;1, 1;2, 1;3, 1;4, 1;5
tie K-XII
tiger T-VII, VIII, IX
turkey K-V
turn (off) K-IX
two T-VIII, IX
up T-VI, VII; K-III, X (?off); H-1;4, 1;5
walk T-VIII
watch K-VIII
wauwau H-1;1, 1;2, 1;3, 1;4, 1;5
where T-IV, VII, IX
woof-woof K-XII
yeah T-VI
yes H-1;4
5 Developmental reorganization of phonology:
a hierarchy of basic units of acquisition
Marlys A. Macken
1. Introduction
This chapter describes the acquisition of the consonant system by one child
acquiring Mexican Spanish as her native language. During the earliest stages
(from 1;7 to 2;1 years of age), the data from this child, referred to here as Si,
showed several phenomena that could best be accounted for by assuming a
central role for the word as the basic unit being acquired. Words were, for Si,
prosodic units, each being selected for a particular output form on the basis of
the component consonants and each processed in flexible ways to achieve
preferred output patterns. The evidence for the centrality of words and word
patterns in Si's early development will be the major focus of this chapter.
During the later stages (from 2;2 to 2;5), most of the evidence for words and
word patterns has disappeared, and Si's phonological system during this period
can be described adequately in terms of phonemic contrasts and the more
traditional phonological rules.
Thus, the picture of phonology acquisition that emerges from these data is
one in which there are at least two and possibly three basic units (the word,
the phoneme, and possibly the feature) that figure significantly in the
developmental process. It appears, moreover, that the word is more important in
the earliest stages, and that in the later stages, the phoneme replaces the word as
the basic structural unit of the phonological system. In the final section of the
chapter, the relevance of these data for some aspects of a general model of
phonology acquisition is discussed.
2. Data collection and analysis
Si participated in a longitudinal study designed to investigate the acquisition of
consonants in monolingual Mexican-Spanish-speaking children. Approximately
once a week for a ten-month period, she was recorded for fifteen to thirty minutes
while interacting with the experimenter (E), a native speaker of Mexican Spanish.
Recording was done on a Uher 4000 tape recorder with a Sony Electret microphone
(attached to a soft cloth vest which the child wore). Stimulus materials
included picture books, index cards picturing common objects, and small toy
objects. When not being recorded, she remained in a playroom. Notes were taken
in the playroom by a research assistant. These notes, which included words which
the child used, were used subsequently by the E during recording sessions in order
to elicit as many of the child's vocabulary items as possible. Word lists were
collected from the parents every two weeks and were also used by the E during
recording sessions.
This research is part of the activities of the Stanford Child Phonology Project and was supported by
a National Science Foundation Grant (#SOC 740316 AOl). I would like to thank: Charles
A. Ferguson for support during all phases of the research; Debby Ohsiek, Maria Rodriguez, and
Ana Ortiz for assistance with the data collection; and Carol Stoel-Gammon and Lise Menn for
helpful comments on a draft of the paper. A preliminary version of this paper appears in the
Stanford Papers and Reports on Child Language Development (December 1977), 14, 1–36.
During the week following each session, transcriptions were made of all
tapes by two transcribers working independently and using Revox A77 tape
recorders with Super St-Pro B-V headphones. Two procedures were used to
combine the individual transcriptions of each utterance into a final one (Macken
1978). The transcription system used is that of the International Phonetic
Association, with a supplemental symbology developed by the Stanford Child
Phonetics Workshop (Bush et al. 1973). Consonants were transcribed narrowly
and vowels somewhat more broadly. In this chapter, Si's phonetic segments and
phonetic sequences will be given in square brackets, [. . .], the typical notation
for such segments in descriptions of adult speech. Her phonemes and phonemic
sequences will be enclosed, however, by vertical straight lines, | . . . |, a
notational device used by Smith (1973). The more usual slant lines, /. . ./, will be
reserved for phonemes or phonemic sequences of adult Spanish only. Although
Si's data were transcribed quite narrowly, examples of phonetic sequences when
used in the following text will be given in no narrower a transcription than
necessary.
3. The subject
Si was 1;7 at the beginning of the study and 2;5 at the end. She is the youngest of
seven children; she has three brothers (ages 17, 15, and 6 years at the beginning
of the study) and three sisters (13, 11, and 8). She was born in Redwood City,
California, to parents who had moved there from Michoacán, and the family
speaks only Spanish in the home. However, both parents work and speak
English at their places of employment, and all of her siblings attend
English-speaking schools; thus, Si may have been exposed to some English. During the
first several months of the study, Si was cared for by a monolingual
Spanish-speaking neighbor. She subsequently was enrolled in a day-care program for
Spanish-speaking children which was staffed by native speakers of Spanish;
nearly all the children enrolled were monolingual Spanish-speaking children.
During the period (prior to and) from 1;7 to 2;2.15, Si's environment was almost
exclusively Spanish-speaking, and her exposure to English is presumed to have
been minimal.
When Si was 2;2.15, she was transferred to another day-care program, in
which there were many monolingual English-speaking children. The program's
staff reported to us that Si began learning English very quickly. During the
period from 2;2.15 to 2;5, she used several English words during our sessions:
apple, phone, shoe(s), car, puppy, and mommy home were used frequently; hi
and monkey were used once; and donkey, sun, and berry are possible English
glosses for utterances produced in ambiguous contexts.
During her thirty-five sessions over the ten months, Si spontaneously
produced nearly 200 recognizable words. The corpus of spontaneously produced
speech contains 2,536 tokens and accounts for 51 percent of her total
recognizable speech. In imitation, she produced 2,463 tokens of both the same words and
many other words that she never produced spontaneously. From the very
beginning, Si imitated more than any other subject in the study and more than
any of the children reported in Bloom, Hood, and Lightbown (1975). During a
preliminary analysis, it became clear that the relationship between spontaneous
and imitated forms was a complicated issue: imitations were neither always in
advance of spontaneous forms nor necessarily predictive of spontaneous
development. These two types of productions were analyzed separately. In this
chapter, only the analysis of spontaneously produced speech will be reported.
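The 51 percent figure can be checked from the two token counts given above. The short Python sketch below assumes that "total recognizable speech" is simply the sum of the spontaneous and imitated tokens; the text does not state this explicitly, and the variable names are illustrative only.

```python
# Token counts reported in the text for Si's thirty-five sessions.
spontaneous_tokens = 2536
imitated_tokens = 2463

# Assumption: total recognizable speech = spontaneous + imitated tokens.
total_tokens = spontaneous_tokens + imitated_tokens
spontaneous_share = spontaneous_tokens / total_tokens

print(f"total tokens: {total_tokens}")
print(f"spontaneous share: {spontaneous_share:.1%}")
```

Under that assumption the spontaneous share rounds to the 51 percent reported.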
There were several aspects of Si's speech that were particularly striking and
are relevant to her phonological development: her use of a pre-utterance vowel;
the diminished phonetic accuracy that occurred when she produced words in
phrases; her use of consolidated unit phrases and routines; and her
misperceptions. The first three characteristics of her speech, together with several
other factors, contributed to the enormous amount of phonetic variation which
was the hallmark of her productions, which set her distinctly apart from the
other subjects, and which presented the first obstacle to the phonological
analysis. All four aspects of her production are relevant to general features of
her phonology and, in particular, to the role that the word as a phonological
unit played in her development.
From 1;7 to 2;5, 19 percent of all spontaneous utterances and 20 percent of all
imitated utterances were produced with a prefixed "filler" that was almost
always a neutralized vowel, but that sometimes was a syllabic consonant or a
syllable of the form CV; in rare cases, she added a filler (either a vowel or a
syllable) to the ends of utterances. This pre-utterance vowel occurred freely,
and no restriction of its occurrence to particular words, word classes or phonetic
environments could be discerned. Although there were several changes in the
frequency with which this vocalic segment occurred, the uniformity of its
appearance over the ten-month period indicates that it was a general
characteristic of her speech, perhaps an initiation-of-speech phenomenon.
Obviously, the pre-utterance vowels presented problems for analysis.
Particular occurrences could variously be interpreted as an article, a verb
form, the unstressed initial syllable of a particular word, etc. In any particular
case, any of these interpretations would have consequences for the phonological
analysis. Since the interpretation of these vocalic segments was problematic, it
was decided to treat all occurrences as a single phenomenon. Thus, they were
not counted as independent words in the session tabulations for words and
tokens, and they were ignored in the phonological analysis. A possible
consequence of this decision is the underestimation of Si's abilities. For example, she
was not credited with the acquisition of some articles and the verb forms es and
está until late in the study when clear evidence obtained. Similarly, she was not
credited with the ability to produce three-syllable words until three-syllable
words of the form [CVCVCV] were produced; words with an initial unstressed
vowel syllable (like araña 'spider') remained difficult to interpret. If in fact the
pre-utterance vowel represents an attempt on Si's part to designate semantic
properties such as definite/indefinite reference, then it is the case that this
attempt preceded any other evidence of such semantic knowledge by many
months. Alternatively, its use could merely represent an attempt to replicate the
length and form (without content) of adult utterances. In any event, the use of
these vocalic segments is consistent with her general tendency to embed a word
in a longer carrier. In fact, during the earliest sessions, Si frequently produced
long, basically unanalyzable utterances in which only one recognizable word
occurred.
When Si began to combine two recognizable words, the phonetic accuracy of
both words decreased. In many of these two-word sentences, she consolidated
the two (polysyllabic) source words into one smaller (e.g., two-syllable) form.
In the resulting two-syllable sentence, only one syllable of each word was
produced; in contrast, when these words were produced in isolation, they were
rarely, if ever, produced as a single syllable only. Possibly similar to this was the
way in which she consolidated learned routines: for example, qué es? 'what
is?' → |kes|.¹ It seems that Si's willingness to use large units (e.g., long words,
sentences, and routines) while at the same time having only limited production
abilities (and/or limited semantic knowledge) led to the phenomenon of such
coalesced or consolidated units and sentences. This phenomenon is directly
related to the most striking feature of her phonology: her use of coalesced
word patterns. For the phonological analysis of such routines, the surface form
of the unit phrase was treated as being the phonological form also (since the
component words did not occur separately). The phonological structure of
individually occurring words was determined on the basis of their most
frequently produced form in isolation.
One way of looking at Si's consolidation of routines and phrases would be
that these routines were in fact units for her, stored and used as wholes. This
interpretation would also be relevant for an analysis of her misperceptions,
imitations in which she misperceived the adult model and produced a different,
but phonetically similar word. In this study, the frequency of misperceptions
was characteristic of only Si among the six subjects who participated. For
example, the E's Armando was repeated by Si as Fernando, león 'lion' as avión
'airplane', gallo 'rooster' as caballo 'horse', limón 'lemon' as jamón 'ham', and
taza 'cup' as casa 'house'. In the misperceptions, the two words involved are
phonetically similar but rarely visually or semantically similar; they would be
called 'slips of the ear'. Since the phonetic similarity between the words
involved tended not to consist of single segment or feature changes but rather
was based on some holistic similarity between the words, the misperceptions are
in this way similar to the unit routines. For the adult, words have a specific
segmental composition, but for Si words seemed to have in addition (and in
some cases, only) a general prosodic shape. As will be discussed, the
phenomena of unit phrases and misperceptions are similar in important ways to
aspects of her phonological development in that all three are suggestive of a
system in which the word is a unitary whole which has a general prosodic shape
(perhaps as yet unanalyzed segmentally) and which can be prosodically similar
in a loose way to other words.
4. Si's phonology at the beginning of the study (1;7)
Si is learning the phonological system of the variety of Spanish spoken by her
family and neighbors, almost all of whom come from the state of Michoacán in
Mexico. Since the Project did not analyze the phonology of the subjects'
parents, and there are no phonological analyses of Michoacán Spanish, the
adult phonology to which Si is progressing is by necessity presumed to be that
of general Mexican Spanish, a variety that has received considerable attention
from linguists. Given the basic sound classes (stops, fricatives, and sonorants),
the following divisions can be made. The feature of voicing is distinctive in the
stops /ptk/ versus /bdg/ and in the fricatives as /fsx/ versus [βðɣ]. However, the
voiced fricatives are allophones of the voiced stops /bdg/: [bdg] occur in
utterance-initial position and after certain sonorants; [βðɣ] occur in all other
positions. Depending on the reference being cited, /tʃ/ is classified with the
voiceless stops (Alarcos Llorach 1950), with the voiceless fricatives (Stockwell
and Bowen 1965) or as the single member of a voiceless affricate class
(Dalbour 1969). In this chapter, /tʃ/ will be referred to as a member of the
voiceless fricative class. The greatest disagreement concerns the classification
of /w/ and /j/: (1) as semivowels (Stockwell and Bowen 1965); (2) as fricatives
(Dalbour 1969); or (3) in separate classes, the /j/ with /bdg/ as voiced or lax
sounds, and [w] as a phonetic nonsyllabic variant of /u/ or /gu/ (Alarcos
Llorach 1950). The two sounds /w/ and /j/ will be referred to as glide consonants
in the present chapter. The class of sonorants is divided into the nasals /mnɲ/ and
the liquids /lɾr̄/. The symbol /ɾ/ represents the apico-alveolar single flap r, and /r̄/
stands for the apical trill r phoneme. All eighteen consonant phonemes may
occur in intervocalic position, and all but two (/ɾ/ and /ɲ/) may appear in initial
position; in final position, however, only /nsrld/ may occur. The symbol /r/ for
final position represents a neutralization of the two r phonemes, and final /n/
represents a neutralization of all nasal phonemes.
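The stop/fricative alternation of the voiced obstruents described above can be sketched as a small rule. The chapter says only that the stops occur utterance-initially and "after certain sonorants"; the sketch below assumes the usual textbook statement for Spanish (stops after a nasal, and /d/ also after /l/), and the function name and the one-symbol-per-segment input are hypothetical conveniences.

```python
# Voiced stops and their fricative allophones (IPA).
STOP_TO_FRICATIVE = {"b": "β", "d": "ð", "g": "ɣ"}
NASALS = {"m", "n", "ɲ"}

def realize_voiced_stops(phonemes):
    """Map a list of phoneme symbols for one utterance to surface phones."""
    phones = []
    for i, seg in enumerate(phonemes):
        if seg not in STOP_TO_FRICATIVE:
            phones.append(seg)
            continue
        prev = phonemes[i - 1] if i > 0 else None
        # Assumed stop contexts: utterance-initial, after a nasal,
        # and (for /d/ only) after /l/.
        stop_context = (
            prev is None
            or prev in NASALS
            or (seg == "d" and prev == "l")
        )
        phones.append(seg if stop_context else STOP_TO_FRICATIVE[seg])
    return phones

print(realize_voiced_stops(list("dedo")))   # initial stop, medial fricative
print(realize_voiced_stops(list("ambos")))  # stop after a nasal
```

The rule deliberately ignores word boundaries inside an utterance and final-position neutralizations, which the text treats separately.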
Si was 1;7 at the beginning of the study. During the first month, her
spontaneous production was limited to a very small set of words (N = 12). The set of
consonantal phonemes in her speech was correspondingly small: the voiceless
stops [p, t], the nasals [m, n], and the glides [w, j]. Si's phonemes corresponded
to the appropriate adult phonemes, with the following additional relationships:
/b/ was realized as /p/ or /w/; /k/ as |t|; /g/ as |w|; /ɾ, ɲ/ as |j|; and /r̄/ as |n| (in one
word, as a result of nasal assimilation). Phonetically, Si's nasals occasionally
were de-nasalized and the voiced bilabial stop was sometimes weakened to a
glide; a lenis articulation was common to much of Si's production throughout
the period studied. By far the greatest phonetic variation was seen in
productions of mira 'look' (e.g., [ja], [hi ja], [i ja], [mi ja], [bi j], [mi a] or [ j]).
A limited segmental system is typical of the phonology of a young child, and
the correspondence between the child's system and that of the adult is typically
captured by a set of substitution rules similar to those presented above. In
addition, children's productions must be further described by the set of
constraints that determine the ways in which consonants can be combined in words
of two or more syllables and the constraints that determine the number of
syllables that can be produced in any given word. The restriction of words to
one or two syllables is probably universal during early acquisition. Consonant
harmony, a constraint that stipulates that if two consonants appear in a word,
they must be the same or highly similar, is widespread (Vihman 1978) and is
frequently identified as a universal (Smith 1973). Although consonant harmony
may be either complete (i.e., involving both place and manner modifications) or
partial (i.e., either place or manner), 'harmony involving all the features of a
segment as opposed to only one or two . . . is . . . characteristic of very early
speech' (Smith 1973).
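The distinction between complete and partial harmony can be made concrete with a small sketch. The feature table below is an illustrative assumption (broad place and manner classes only, with voicing ignored, so that [p] and [b] count as "highly similar"); it is not the feature system used in the chapter.

```python
# Broad (place, manner) classes for a few consonants; illustrative only.
FEATURES = {
    "p": ("labial", "stop"), "b": ("labial", "stop"), "m": ("labial", "nasal"),
    "t": ("dental", "stop"), "d": ("dental", "stop"), "n": ("dental", "nasal"),
    "k": ("velar", "stop"), "g": ("velar", "stop"),
}

def harmony_type(c1, c2):
    """Classify two consonants of a word as complete, partial, or no harmony."""
    place1, manner1 = FEATURES[c1]
    place2, manner2 = FEATURES[c2]
    same_place = place1 == place2
    same_manner = manner1 == manner2
    if same_place and same_manner:
        return "complete"
    if same_place or same_manner:
        return "partial"
    return "none"

print(harmony_type("n", "n"))  # two identical dental nasals
print(harmony_type("p", "m"))  # same place, different manner
print(harmony_type("m", "t"))  # neither place nor manner shared
```

A fuller treatment would need finer features (voicing, continuancy) and a decision about how glides and liquids are handled.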
In Si's productions, all words were either one or two syllables long, as
expected. With regard to consonant harmony, the situation was more
complicated. Several words exhibited harmony but were productions of adult Spanish
words in which both consonants already agreed either partially or completely: in
bebé 'baby', papá 'father', and guau guau 'bow wow', both syllable-initial
consonants were of the same place of articulation and a highly similar manner.
One word, rana 'frog', was a clear case of complete harmonization ([na na]).
Dame 'give me' was also (partially) harmonized ([pa me]): however, in all
subsequent productions of this word, the initial consonant was deleted.
In contrast, manzana 'apple' was not produced with both consonants
agreeing in place feature: manzana → |ma na|. The completely harmonized
production which could have been expected is [na na], which was the production used
by all the other children. In the output form of [nana], a weak-syllable deletion
rule could be posited to explain the reduction of this word from three to two
syllables. Such a rule is quite common in early child phonology (Ingram
1974b), and a similar rule (initial syllable(s) deletion) was characteristic of
another subject in the study. In fact, Si used the initial syllable(s) deletion rule
several months later. In the beginning stages, however, she deleted the medial
syllable of manzana, and in general deleted syllables in a flexible manner
consistent with the goal of producing a favored output form. The flexible
syllable deletion rule, the absence of complete harmonization, and the use
of a favorite two-syllable canonical form (in which a word-initial labial
consonant combined with a medial dental consonant) as seen in the production of
manzana were to become typical features of Si's early phonological system.
Niño 'boy' and niña 'girl' were produced with an initial dental nasal and a
medial glide (i.e., /ɲ/ → |j|). Here and during all subsequent stages, glides freely
combined with other consonants in two-syllable words. Evidently the
nonconsonantal nature of glides exempted them from the constraints which limited
the co-occurrence of consonants. As will be seen, the liquids also combined
freely with other consonants during the stages in which they were realized as
either glides or liquids.
Thus, Si's phonological system during the month she was 1;7 was limited to
labial and dental stops and nasals and the glides. During the first session, she
also occasionally babbled long sequences of syllables in which the consonant
segment was either [b], [p] or [w]; such babbling did not occur in later sessions.
However, during all four sessions of this period and during many sessions in the
first several months, Si produced long utterances containing only one
recognizable word. In such sentences, the extra syllables contained the same
consonant as occurred in the recognizable word; thus, these sentences were
primarily labial or dental sequences. Si continued with her preference for labial
and dental stops and nasals in selecting favorite words, in her nonsense
rhyming, and in her sound play, all of which drew upon this set of consonants. This
set remained at the core of her phonology for several months.
5. Si's early phonological development (1;8 to 2;1)
In the preceding section, syllable deletion and consonant harmony were
mentioned as major means by which many very young children simplify the
phonological form of adult words. Smith (1973) includes these in his list of
the four functions of child rules: (a) consonant and/or vowel harmony;
(b) consonant cluster reduction; (c) systemic simplification (e.g., the reduction
of adult contrasts); and (d) grammatical simplification (e.g., the absence of final
s and hence of the singular–plural contrast in English). Clearly, constraints on
the length of words, on the complexity of the child's phonological system, and
on the complexity of combinations of sounds in words operate universally to
effect simplification. However, there is ample evidence in recently published
papers that demonstrates that individual children may differ in the strategies
they adopt to achieve such simplification, a fact recognized explicitly by Smith
in a later paper (Smith 1975): 'whereas the tendencies or strategies themselves
are universal, the rules which implement them . . . are child specific'.
It is also true that in spite of the early and strong necessity to simplify the
adult phonology, children must ultimately learn the entire set of phonological
units which are contrastive in the language being learned. The phoneme is one
such unit that is traditionally recognized in phonological theory and is used
frequently as the basic unit in studies of child phonology (Smith 1973).
However, recent studies (Ferguson and Farwell 1975; Menn 1971, 1977; see
also Ferguson 1977a) have demonstrated that a more appropriate unit of
analysis for the corpora from very young children is the word.² In these
studies, phonological rules are not realization rules deriving a child's surface
form from an underlying adult phoneme (as in the Smith framework) but, rather,
are formalizations of the strategies that a particular child has adopted to
represent words and classes of phonetically similar words (see Menn 1976).
Si's early development can best be accounted for within a framework that
recognizes the significance of early words and word shapes in the development
of the young child's phonology and the variability with which individual
children implement the simplification processes (Ferguson and Farwell 1975;
Farwell 1976).
It will be seen that the use of the word (and word patterns) rather than the
adult phoneme (and phonemic contrasts) as the basic organizing unit of Si's
early phonology better explains the variation in words over time, the
development of canonical forms, the variable correspondence between adult phonemes
and Si's phones, and several additional phenomena that would be largely
inexplicable within a framework like Smith's, which maps adultlike underlying
representations onto the child's surface forms. By the end of this period,
however, much of the evidence for a word-based phonology has disappeared,
and Si's productions during the period 2;2 to 2;5 can more easily be described in
terms of phonemic contrasts and related phonological rules. During the period
up to 2;2, Si was also learning contrasts between individual sounds and the
equivalences between similar sounds in different environments (i.e., phonemic
contrasts and allophonic relationships) and in fact during the period 2;2 to 2;5,
her phonetic realizations of adult Spanish phonemes in different positions,
environments, and words were much more regular. Although the framework
of Section 6, which covers the period 2;2 to 2;5, will accurately reflect Si's
transition from words to phonemes, this change is most obvious in the
comparison of the end state (2;5) with the beginning one (1;7–1;8); how the transition
precisely came about is not nearly so clear, primarily because the two
developments overlapped considerably.
5.1. The learning of word patterns
Table 5.1 presents the development of word patterns in two- and three-syllable
words of the form #(C)(V)CVCV#. These word patterns capture the ways in
which constraints on the co-occurrence of consonants in words were gradually
relaxed; for this reason, word patterns of the form #(V)CV# (i.e., words with
Table 5.1. Development of word patterns in two- and three-syllable words
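Column I (C1 C2): [α place] [α place]
Column II (C1 C2): [+ front] [+ back], subdivided into [+ labial] [+ dental] and Other
Column III (C1 C2): [+ back] [+ front], subdivided into [+ velar] [+ dental] and Other
Stage  I  II: [+ labial] [+ dental]  II: Other  III: [+ velar] [+ dental]  III: Other  Age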
I p/b__p/b__
n__n__ p__m__ m__n__ 1;7
II m__m__ p/b__t/d__ b__k__ (t__p__t__) 1;8.7
III k__k__ p/b__n__ 1;9
t__t__ p/b__nt__
IV t__t__ t__n__ m__s__ n____ 1;10.7
t__t__ f__n__
V t__l__ 1;11.7
t__nt__ p__l__ b__ __ k__t/d__
VI t__t__t__ p__n__ __n/t____ 2;0.7
p__(n)t__(n)
f__nt__
f__t__n__
p__s__ k__s__
VII (A) n__l__ p__n k__l__ n__f__ 2;1
n__t__ f__ t__ k__m__
s__n__ m__l__ s__p__t__
t__n P__t__n__ d__k__
(B) l__l__ b__mb__ b b s
f n s
j b s
m s
g
s#
b j n
s__p
l__p__s
2;2
Notes:
1. All word patterns in the table correspond to two-syllable productions except the following: t__p__t__ and s__p__t__ for zapato (stages II and VIIA); t__t__t__ for
chachita (stage VI); f__t__n__ for teléfono (stage VI), and p__t__n__ for manzana (stage VIIA). Only one other three-syllable production occurred during this period:
k__w__j__ for caballo (stage VIIA).
2. = only one occurrence of one word.
3. ( ) = an optionally occurring consonant
4. / = either consonant may occur.
5. L = cover symbol for a liquid which had various phonetic forms.
only one consonant) are not included. The set of consonants included in this
table includes stops, nasals, fricatives, and liquids; from the beginning stage
(1;7), the glides could freely combine with other consonants and hence have
been omitted from the table.
In the first column are the word shapes in which both consonants agreed in
place. Although in adult Spanish /tʃ/ is a palatal affricate and /tdns/ are referred
to as dental consonants, they are here considered to agree in place because
Si's productions of these sounds did not conform to the adult contrast: she
usually produced all five consonants as [+ alveolar], but her phonetic range
covered the entire dimension of dental to palatal. These sounds will be referred
to as the class of dental consonants, following the typical nomenclature for the
adult phonemes /tdns/.
The most interesting developments in word patterns are seen in columns II
and III. Here, it is clear that Si preferred the order of consonants in a word to be
[+ front] in initial position and [+ back] in medial position: all final consonants
were deleted until stage VII. Moreover, the preferred initial consonant was a
labial one and the preferred medial was a dental. This preferred front + back
ordering accounts for the output form of all words containing a place contrast
from 1;7 to 1;11.15, with the exception of the pattern [t__p__t__] used only for
zapato 'shoe' (stage II). This pattern appears in parentheses in the table, because
it was not a productive word pattern (i.e., it was used for no other words nor for
the generation of additional word patterns). The early syllable structure
accuracy (stage II) of zapato is unusual for two reasons: (1) it was lost during stage
III, and (2) no other three-syllable productions were regularly produced until
stage VI. Two other words, elefante 'elephant' and manzana 'apple', were also
produced with a three-syllable form for a brief period before being regularized
to a two-syllable form (see Moskowitz 1971 and 1973 on phonological idioms).
Up to stage V, all initial consonants in column II class words were labial.
During stages V and VI, a pattern emerged which violated the front + back
ordering: the new word pattern contained the other member of the [+ grave]
class (a velar stop) in initial position, with a dental consonant still preferred in
the medial position. That the new pattern was not of the form dental + velar is
significant and demonstrates that Si's preference was not simply a fronting
strategy (see Ingram 1974a on fronting in child phonology).
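The column assignment used in Table 5.1 can be restated as a small procedure.
The following Python sketch is illustrative only: the ASCII consonant symbols,
the PLACE and FRONTNESS tables, and the function name table_column are
assumptions introduced here, not part of the original analysis, and the place
classes are simplified to labial, dental, and velar.

```python
# Hypothetical, simplified place classes (ASCII stand-ins for the phones).
PLACE = {
    "p": "labial", "b": "labial", "m": "labial", "f": "labial",
    "t": "dental", "d": "dental", "n": "dental", "s": "dental", "l": "dental",
    "k": "velar", "g": "velar",
}
# Smaller number = further front in the mouth (an assumed ordering).
FRONTNESS = {"labial": 0, "dental": 1, "velar": 2}


def table_column(c1: str, c2: str) -> str:
    """Return the Table 5.1 column (I, II, or III) for a C1__C2__ pattern."""
    f1, f2 = FRONTNESS[PLACE[c1]], FRONTNESS[PLACE[c2]]
    if f1 == f2:
        return "I"    # both consonants agree in place
    if f1 < f2:
        return "II"   # [+ front] before [+ back], the preferred order
    return "III"      # [+ back] before [+ front]


for c1, c2 in [("m", "n"), ("p", "t"), ("b", "k"), ("k", "t"), ("t", "p")]:
    print(f"{c1}__{c2}__ -> column {table_column(c1, c2)}")
```

On this reading, the velar + dental pattern of stages V and VI falls in column
III, as does the unproductive t__p__t__ of zapato.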
The data in columns II and III also show that the process by which Si
expanded her repertoire of word patterns was one in which new patterns were
created out of existing ones. The patterns [p/b__n__] and [p/b__nt__] of
stage III represent a combination of the patterns [m__n__] and [p/b__t/d__] of
the earlier two stages. In stages IV through VII, the set of possible initial
consonants was expanded to include the remaining labial consonants, while
the set of possible medial consonants was expanded to include several
additional dental consonants. The creation of new word patterns on analogy with
existing ones is seen most clearly in the great expansion during stage VI, when
many new words and new word patterns were acquired. In this stage, Si
expanded her general labial + dental pattern to include nearly all the
possibilities of appropriate consonant co-occurrence. She also overgeneralized
[+ velar] in initial position to include a velar nasal, a sound that has no phonemic
status in adult Spanish and occurs in syllable-final position only (as an
allophone of /n/ before velar consonants).
The use of [ŋ] occurred only during a two-week period and was restricted to
productions of three words: rana 'frog' [wa a, ga , a na]; bola 'ball'
[a w]; and gato 'cat' [ak to, a ko]. The productions for gato occurred
only in imitations; Si rarely used this word spontaneously. Gato was produced
with either an initial [n] or an initial [k] for a three-week period during the
following stage and subsequently stabilized with an initial [k]. Prior to stage VI,
gato had been produced as [ka ko].
The ways in which Si expanded her repertoire of word patterns by combining
and/or expanding existing patterns point to the significant roles that
overgeneralization and analogy play in the acquisition of phonology (as they do in the
acquisition of syntax and semantics). Si's phonemicization of the velar nasal
strikingly demonstrates these processes and shows that it is not always the case
that the child's phonemes correspond directly to the representations and feature
assignments of adult phonology; more importantly, it points to the creativity
exhibited by the child in his/her role as the active organizer of phonology. This
latter fact has only recently begun to be recognized (see Moskowitz 1971); most
previous discussions of phonological acquisition assumed the role of the child
to be passive. Recent work by Kiparsky and Menn (1977) assigns a significant
place to the child's creative role and characterizes phonological acquisition as
inherently a cognitive (i.e., problem-solving) task. Clearly this approach fits Si's
development well.
Words during these stages fell into several types of patterns, and Table 5.1
demonstrates the regular nature of word pattern development over time. These
facts in themselves suggest the importance of words and word patterns to Si's
early phonological development. More convincing evidence comes from the
flexibility with which several processes operated on individual words to
produce the preferred word patterns, from several unusual substitutions, and from
the change in form of several words which occurred as new word patterns were learned.
Table 5.2 presents the first two word patterns learned by Si and shows the
ways in which words were processed to fit the output goal. The first column
gives the words that were selected to fit particular patterns. The next four
columns correspond to four processes (syllable reduction, metathesis,
substitution, and consonant cluster (CC) reduction) that Si used to simplify adult
words. The last column gives a phonetic transcription of the most typical output
form; when two forms occurred, both are given.
The word pattern [p/b__t/d__] was much more productive than [m__n__];
however, the same statements can be made for both pattern types, and the
general point of interest here, the variable nature of the processes,
applies to both equally. The process of syllable reduction operated to delete
Table 5.2. Analysis of the first two word patterns acquired by Si: [m__n__] and [p/b__t/d__]
Processes
Word pattern Stage Age Words Syllable reduction* Metathesis Substitution CC-reduction Sample production
(A) [m__n__] I 1;7 manzana manzana m w mnna
mano wainno
II 1;8.7 Fernando Fernando
Fernando
f m
m w
nd n
"
mann
wanno
nanno
#
Ramón Ramón m b ~ w ~ m mn
VII 2;1 comiendo comiendo ndn minnu
(B) [p/b__t/d__] II 1;8.7 pelota
pata
pato
zapato
pelota
zapato
)
o a
(assimilation)
patda
pa ta
ptda
"
pwatto
bwaddo
#
sopa
Vicki
sopa
kt
pwta
"
wt ^ t
bjk ^ ke
#
III 1;9 elefante elefante f b ~ p nt t batte
[
librito
]
librito br b
}
pitd
libro libro l d br b
IV 1;10.7 vestido vestido i, o i bit-ti
perro i d b d
plato pl p pw t
Table 5.2. (cont.)
VI 2;0.7 bota
pastel
l n ~ st t pattj
pttn
VII 2;1 reloj b
}
buddo
l d
p ~ l
l l }
[
hlllo
]
pllo
Notes:
1. The variability in the voicing of stops is not listed under substitutions; see Section 5.2 for discussion of the acquisition of the voicing contrast.
2. Words are listed under the first stage in which they appeared.
3.* = Deleted portion italicized.
either the initial syllable(s) or the medial one. The choice of which was deleted
depended crucially on the consonants in the syllables. Si's goal was to achieve
an output form of labial + dental. If the labial consonant occurred in the
correct initial position, then the medial portion of the word was deleted:
manzana 'apple', Fernando 'brother's name', pelota 'ball', and vestido
'dress'. If, however, the labial consonant was in medial position in the adult
word, the preceding syllables were deleted: Ramón 'brother's name', comiendo
'eating', zapato 'shoe', elefante 'elephant', and librito 'little book'.³ In the cases
in which medial segments were deleted, the situation was more complicated
than just the deletion of syllables. The general phonetic quality of the vowels
that appeared in the output forms suggests that Si tended to retain the vowel of
the stressed (penultimate) syllable; this vowel was then combined with the
word-initial consonant to form the first syllable of the output form (e.g.,
Fernando and vestido, Table 5.2; see Menn 1974 on the essentially universal
preservation of the stressed vowel in child forms).
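The two deletion strategies just described can be sketched procedurally. The
Python fragment below is a rough illustration under stated assumptions: adult
words are given orthographically as (onset, rhyme) syllable pairs, stress is
taken to fall on the penultimate syllable, and the names LABIALS and
reduce_syllables are hypothetical. It models only syllable reduction;
substitution and cluster reduction, which also shape the attested output forms,
are left out.

```python
# Orthographic labials; Spanish <v> is included because it spells /b/.
LABIALS = set("pbmfv")


def reduce_syllables(syllables):
    """Reduce an adult word so that a labial consonant ends up word-initial.

    `syllables` is a list of (onset, rhyme) pairs; penultimate stress is assumed.
    """
    if len(syllables) < 2:
        return "".join(o + r for o, r in syllables)
    first_onset = syllables[0][0]
    if first_onset[:1] in LABIALS:
        # Labial already initial: keep it, take the stressed (penultimate)
        # rhyme, and keep the final syllable; the rest of the middle is dropped.
        stressed_rhyme = syllables[-2][1]
        last_onset, last_rhyme = syllables[-1]
        return first_onset[0] + stressed_rhyme + last_onset + last_rhyme
    # Otherwise delete every syllable that precedes the first labial onset.
    for i, (onset, _) in enumerate(syllables):
        if onset[:1] in LABIALS:
            return "".join(o + r for o, r in syllables[i:])
    return "".join(o + r for o, r in syllables)  # no labial: left unchanged


print(reduce_syllables([("m", "an"), ("z", "a"), ("n", "a")]))   # manzana
print(reduce_syllables([("f", "er"), ("n", "an"), ("d", "o")]))  # Fernando
print(reduce_syllables([("z", "a"), ("p", "a"), ("t", "o")]))    # zapato
```

The strings returned here are intermediate shapes rather than Si's actual
productions: a form such as Fernando would still be subject to cluster
reduction and substitution before it matched the [m__n__] pattern.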
In both syllable reduction types, the adult word functioned as a single
prosodic unit, the features of which were changed to fit the output goal. All
the adult words that were selected by Si to fit her patterns of [m__n__] and
[p/b__t/d__] had a labial and a dental consonant somewhere in the word; this
requirement is crucial for distinguishing between words that underwent
metathesis as opposed to harmony. Metathesis, an uncommon process in child
phonology, occurred in Si's speech but not in the speech of another subject, one
who used (complete) harmony almost exclusively to simplify adult
words (Macken 1978). Si also used (complete) harmony but considerably less
than would be expected on the basis of its documentation in the literature. The
only words which underwent metathesis were those words that contained labial
and dental consonants in the wrong order (e.g., sopa 'soup', Table 5.2B; and
teléfono 'telephone', Table 5.3B). In contrast, words that exhibited complete
harmony were words that had an incorrect ordering of consonants but lacked
one of the pattern-criterial consonants (e.g., gato 'cat' [ka ko], stage III). The two
words in the earliest stage that also exhibited harmony (rana [nana] and
dame [pame]) do not fit the rule just stated. However, from stage III on, rana
was produced just as often with an initial bilabial phone as with an initial [n].
During stage I, Si's production for dame changed to [ame]. It seems that these
two words were acquired prior to the point (stage II) at which Si settled on her
[labial + dental] pattern preference and thus were exempt to some extent from
her later rules. Cases of metathesis and (complete) harmony were not common
in Si's data.
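The division of labor between metathesis and harmony described above amounts
to a decision rule over the places of the two consonants. The sketch below is an
illustration under several assumptions: the place table, the function name
repair_strategy, and the returned labels are introduced here for exposition,
and the early exceptions rana and dame are not modeled.

```python
# Hypothetical place table using ASCII stand-ins for the consonants.
PLACE = {
    "p": "labial", "b": "labial", "m": "labial", "f": "labial",
    "t": "dental", "d": "dental", "n": "dental", "s": "dental",
    "k": "velar", "g": "velar",
}
FRONTNESS = {"labial": 0, "dental": 1, "velar": 2}


def repair_strategy(c1: str, c2: str) -> str:
    """Name the process expected for an adult C1...C2 consonant sequence."""
    p1, p2 = PLACE[c1], PLACE[c2]
    if FRONTNESS[p1] <= FRONTNESS[p2]:
        return "none"          # already front + back, or same place
    if {p1, p2} == {"labial", "dental"}:
        return "metathesis"    # both criterial consonants present, wrong order
    return "harmony"           # wrong order, one criterial consonant missing


print(repair_strategy("s", "p"))  # sopa
print(repair_strategy("g", "t"))  # gato
print(repair_strategy("p", "t"))  # pato
```

Under these assumptions sopa and teléfono are routed to metathesis, whereas
gato, which has no labial, is routed to harmony.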
Substitution processes also operated in different ways, depending on the
consonant structure of the adult word and the requirements of Si's patterns.
For example, in Fernando (which contains a word-initial labial with a nasal as
the first consonant of the stressed syllable), /f/ → |m|, while in elefante, /f/ →
|b/p| (Table 5.2); in perro 'dog', // → |d|, while in reloj 'watch', // → |b|
(Table 5.2). Further examples of substitutions being determined by word
patterns will be seen in Table 5.3. In contrast to the goal-directed and, hence,
variable nature of the processes of syllable reduction, metathesis, and
substitution, the fourth process needed to explain surface forms, consonant
cluster reduction, was very systematic: nasals were deleted when
followed by a voiceless stop; voiced stops were deleted after a
nasal (e.g., /nd/ → |n|); liquids were deleted in all clusters; and fricatives were
deleted when followed by a stop (Table 5.2).
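Because cluster reduction is described as fully systematic, it lends itself to
a direct restatement. The following is a minimal sketch, assuming ASCII
symbols, two-consonant clusters only, and hypothetical names (reduce_cluster
and the class sets). The voiced-stop rule is implemented as it appears in the
Table 5.2 entries (nd reduced to n), which is an interpretive assumption.

```python
NASALS = set("mn")
VOICELESS_STOPS = set("ptk")
VOICED_STOPS = set("bdg")
LIQUIDS = set("lr")
FRICATIVES = set("fs")


def reduce_cluster(cluster: str) -> str:
    """Reduce a two-consonant cluster to the single consonant that survives."""
    first, second = cluster
    if first in LIQUIDS:
        return second                      # liquids are deleted in all clusters
    if second in LIQUIDS:
        return first
    if first in NASALS and second in VOICELESS_STOPS:
        return second                      # nasal deleted before voiceless stop
    if first in NASALS and second in VOICED_STOPS:
        return first                       # voiced stop deleted after a nasal
    if first in FRICATIVES and second in (VOICELESS_STOPS | VOICED_STOPS):
        return second                      # fricative deleted before a stop
    return cluster                         # no rule applies


for c in ["nt", "nd", "br", "pl", "st"]:
    print(c, "->", reduce_cluster(c))
```

The five clusters in the loop correspond to the reductions listed for elefante,
Fernando, libro, plato, and pastel in Table 5.2.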
Table 5.3 contains eleven words and charts their development through stages
I to VII. Most of these words demonstrate unusual correspondences between
adult phonemes and the phones in Si's productions. Several of these unusual
substitutions have already been mentioned. Throughout this period of 1;7 to 2;1,
/t/ → |p/b| only in tenedor 'fork' (Table 5.3B), /m/ → |p/b| systematically only in
manzana 'apple' (Table 5.3A) and Ramón 'brother's name', and // → |p/b| in
initial position but |l| in intervocalic position (reloj versus perro, Table 5.3B).
The first three fricative words that Si acquired were also unusual: /f/ → [t] in
Fernando and elefante; and /s/ → |f| in manzana. Although the phonetic
realization of /f/ in Fernando and elefante was [t] (phonetically very similar
to the adult phoneme /t/), it is possible that the source of the substitution error
was a confusion of /f/ and /s/. In contrast to these three words, all subsequently
acquired words were realized with correct /f, s, t/ contrasts.
With the exception of the words involved in the /f/ and /s/ reversal, all the
unusual substitutions can be accounted for by the overgeneralization of Si's
preferred word patterns, although this 'pattern force' in itself cannot explain
why these particular words were susceptible and not others (see Labov, Yaeger,
and Steiner 1972 on the riddle of actualization). As will be seen, prosodic
similarity between certain adult words provides a plausible explanation for
the similar treatment of some words.
Table 5.3A shows the developmental changes in six words. At stage III,
manzana changed from |mana| to |p/bana|, due to the overgeneralization of the
[p/b__n__] word pattern; at stage IV, it changed to |fana|. This latter change
exemplifies Si's tendency to combine features from different segments of the
adult word: the labiality of the initial /m/ and the frication and voicelessness of
the medial (stressed syllable) /s/. The odd development occurs in elefante,
which had been |pante| during stage IV but changed to |tante| during stage V;
the expected development would have been [fante]. Two possible explanations
can be offered: words as prosodic units, and confusion between adult fricatives.
In the words-as-prosodic-units explanation, two factors may be relevant: (1)
within-word combination of features from different segments; and (2)
cross-word prosodic similarity. (1) First, in manzana, the combination of the features
[+labial] and [+fricative] resulted in |f|; in elefante, it may be that the change of
/f/ to |t| was a result of the influence of the [+dental] feature of /l/, although
such an interaction would be anomalous in Si's treatment of liquid words. (2)
Note that in the preceding stage (IV), cuchara 'spoon' was usually produced as
|ta na|; cuchara also contains a dental liquid in the adult form. To Si, perhaps,
Table 5.3. Acquisition of word patterns as shown in developmental stages of selected words
I II III IV V VI VII
[m__n__]
[
b__d__
]
[p/b__n__] [t__t__] [p__l__] [p__n__] [p__n]
p__t__ [p/b__nt__] [t__n._] [b__(n)t__(n)] [f__t__]
[f__n._] [f__nt__]
Word [m__s__] [f__t__n__1]
(A)
manzana m na
[
bwnn
]
[fann]
[
panna
]
[
panna
]
[
tanna
]
pwnn fanna tanna ptanna
manna anna
elefante
[
ba te
]
pwanti
[
bwante
]
[
bante
]
[
bante
]
pantl tante tante fante
fante fadti
cuchara
[
fanna
]
!
[
fann
]
fadd a la
Fernando
[
manno
]
[
ma no
] [
mn tu
(1)
]
tinal to
(1)
wanno janto meando
nanno
teléfono
[
fn tonno
]
fwe fnno
fwa tin, nu
(B) rana na na
[
wanna
]
[
nann
]
[
manna
]
nna manna znn
wann
wanna
[
gan k
]
anna
na k
Table 5.3. (cont.)
reloj wi jo
(1)
nen no
(1)
ha do
[
bddo
]
winno
hlllo
?jjo
ne o
pllo
(I)
ratón (p ln) tn
(I)
pntn pdn
tenedor b
dd
(I)
pnnn pnt:
pastel
[
pt t
] [
pi t
e
]
pt tn pi t
perro (p lo) b d p o pe do
pe lo p d do
Notes:
1. Variable forms are listed in order of frequency.
2. [ ] = variable forms, given when a stage was characterized by significant variation.
3. ( ) = questionable referent.
4. → = word form remains unchanged throughout period indicated by arrow.
5. Subscript (I) following a form indicates that it occurred in imitation only; imitated forms are given only if they differed significantly from the spontaneous form.
6. Two sets of variable forms are given for rana; those contained in the second group occurred only during the last two weeks of stage VI.
the prosodic similarity of cuchara and elefante was greater than that between
manzana and elefante, which would account for the same pattern having been
adopted for elefante. Similarly, Fernando was either [ma no] or [na no] during
stage II. Here two processes were in competition. In [ma no], the preference for
a labial consonant in initial position resulted in the feature combination of
the labiality of the initial /f/ and the nasality of the (stressed syllable) /n/ to
produce m. In [na no], the initial unstressed syllable was deleted. In both, the cluster
was reduced. During stage VI, Fernando was added to the prosodic class to which
elefante and cuchara belonged, a change which resulted in the new form [ti nal
to]. Throughout this period, Si regularly produced an initial t in imitations of
this word. It was common throughout this period for old words to change as new
word patterns were acquired, a phenomenon which presumably was due to
changes in Si's hypotheses about which words and sounds were similar and
should be said in similar ways.
The alternative explanation would be that Si simply confused /f, s, t/.
However, the production of |f| for /s/ in manzana is consistent with Si's general
tendency to combine features from segments of the adult word, and the
production of |t| for /f/ in elefante and Fernando could be a further example of Si's
general tendency to overgeneralize word patterns to new instances of words that
were prosodically similar in some way. If Si confused these fricatives, she
apparently did so only in these three words: all subsequently acquired words
containing /f, s, t/ were produced correctly, and some of these words were
acquired during the same time period that the three words showed the reversal.
Neither of these explanations, words as prosodic units or fricative
confusion, is necessarily correct, or they both might be. The word as prosodic
unit, though, does seem to be a factor in Si's treatment of tenedor 'fork'. In
Table 5.3B, rana 'frog', reloj 'watch', and ratón 'mouse' show how Si selected
initial // words (which also had a medial dental consonant) to be members of the
[p/b__t/d__] word-pattern class. A possible explanation for why tenedor was also
selected to be a member of this class is that the final /r/ (of the stressed syllable)
made this word prosodically similar to the class of initial // words, in which case
the labiality of the /r/ (phonemically |b| in initial position for Si) combined with the
voicelessness of the initial /t/ to result in the output [p].
The force exerted by preferred word patterns provides a plausible explanation
for the otherwise inexplicable treatment of many adult words and phonemes.
However, the same pattern force which frequently caused words to change
phonological form also caused variation which was not so easily interpreted.
The variation seen in Table 5.3B for rana was typical of many words and also
points to the problem of knowing whether an output change was due to the
phonological reorganization of a word (as in the overgeneralization of a newly
acquired pattern seen in manzana, stage III, Table 5.3A) or whether it was due to
a phonetic slip-of-the-tongue. In Fromkin's theory (1973), slips-of-the-tongue
are not random errors made in speech production but are rule-governed
errors that systematically reveal the nature of the rules of the grammar.
Although Fromkin's theory of speech errors has not been applied to child
phonology, it may provide an explanation for some of the variation seen in
Si's productions of words. In rana, for example, the earliest and most
commonly produced form was [na na]. The variation seen in other productions of
this word could be explained as slips, the nature of which was determined by
the rules of Si's grammar (i.e., the force of word patterns). During stage IV, when
the pattern of [t__n__] was established, rana was imitated a few times as [ta
na]. Its variant form [wa na] during stage VI could be either a phonological
reorganization (patterned after ratón) or a phonetic slip in the direction of a
close and strong pattern. In either interpretation, the production results in a
form that is consistent with other aspects of her phonology. A similar argument
can be made concerning the variation seen in reloj.
The variation seen in the words presented in Table 5.3 is in many ways typical
of Si's productions of particular words, and this variation had many causes. As
previously mentioned, some variation was due to the phonological
reorganization of old words as new word patterns were learned; similarly, the variation of a
word during a single time period was often due to the overlap of stages of
development. It has also been suggested that the infrequently occurring forms of
some words (variable forms which nonetheless exhibited some regular
pattern) can be attributed to the force of the word patterns (i.e., rule-governed
slips-of-the-tongue). All the causes of variation discussed in this section
resulted in phonologically revealing productions; in Section 4, randomly
occurring phonetic variation was mentioned and related to Si's typically lenis
production. One other type of variation was seen in a few words during the earliest
months of the study; this was variation that apparently stemmed from Si's
experimentation as she searched for an acceptable way to pronounce a
particular word. The most striking example of this experimentation was in
Si's productions of elefante prior to the point at which it stabilized as |ba te|
(Table 5.3B, stage III): [
l
hwan tu ti], [pfan tin di], [pan ti], [
1
ban tin di] (early
stage III). Its subsequent variation (i.e., between [ba te] and [p/ban te]) was
due to the overlap of two stages, with only the earlier one characterized by
obligatory cluster reduction. It may be that the occurrence of three different
rules discussed previously (i.e., harmony (rana and dame), initial-consonant
deletion (dame), and labial + dental word pattern (manzana)) was also due to Si's
experimentation, in this case a search for a rule that would simplify word
structures (cf. Menn 1971 on Danny's discovery of harmony rules).
In summary, the evidence for the primacy of word patterns as the organizing
principle of Si's early phonological development has been the following: (1) all
words had a consistent word pattern form; (2) the gradual development of
classes of word patterns can best be described as a process by which new
patterns resulted from the expansion of previously acquired word patterns;
(3) some words changed pattern over time as new word patterns were learned;
(4) three of the four simplification processes operated to produce favored word
patterns as output; and (5) several unusual phonological substitutions and some
phonetic slips can only be explained by the notion of pattern force. It will be
seen (Section 6) that errors during the late stage of acquisition were usually
frozen forms of earlier word patterns that proved to be particularly resistant to
change (cf. regressive idioms, Moskowitz 1971 and 1973).
5.2. The learning of phonemic contrasts
As significant as Si's word patterns are, there is evidence throughout this period
that Si was also learning the phonemic contrasts of Spanish. This evidence is of
two types: (1) the close correspondence between Si's word patterns and the
consonant structure of words that were selected as members of each pattern
class (which shows both an early ability to segment adult words and an early
recognition of place of articulation differences); and (2) the close
correspondence between some of Si's phonemes (as determined by different sets of phones)
and the phonemes of adult Spanish.
From the very beginning, the close correspondence between the labial-dental
place contrasts of Si's patterns and the place contrasts of the adult words that
comprised each pattern class demonstrates Si's recognition of the differences
between these places of articulation, a necessary precursor to phonemic
learning. The velar-dental contrast in stops, which was merged during stage
I (/k/ → |t|), was produced by Si during stage II and was established by stage III
in all words except gato 'cat'. Gato, which was subject to the early ordering
constraints, persisted as [ka ko] until stage VII, at which time it was produced
either as [ka to] or as [d/ta ko]. The latter form contradicts the metathesis rule of
the early stages and was the only example of metathesis involving a velar +
dental sequence in the entire corpus. By stage VII, the early metathesis rule had
dropped out. It may be that a new metathesis rule had replaced the earlier one;
since only the one word (gato) was affected, no rule is set up.
In addition to the separation of places of articulation, Si's stage I productions
showed a phonemically relevant distinction between manners of articulation:
nasal/oral. The resulting four-way contrast of labial/dental and nasal/oral
produced the four consonantal phonemes |p t m n|. These are listed with the glides as
the phonemic inventory for stage I in Table 5.4, which lists the order in which
consonants achieved phonemic status during stages I through VII.
During stage I, voicing was not distinctive, nor was the contrast between the
dental and palatal nasals (/ɲ/ → |n|); Section 4 discussed all aspects of the
neutralization of adult contrasts seen in the substitutions of stage I. During
stages II to VII, Si began to distinguish the voiced/voiceless contrast in at least
the labial stops, acquired the palatal nasal, the stop/fricative contrast, and a
three-way contrast within the voiceless fricative class, and achieved a
rudimentary two-way contrast within the liquid class (Table 5.4).
Within the stop class, a three-way place contrast was accomplished by stages
II/III; the acquisition of the voicing distinction was more complicated.
Throughout stages I to VII, /b/ was usually produced as [w]; in the infrequent
Table 5.4. Acquisition of consonant phonemes
Nonfinal position
Stage Stops Nasals Glides Fricatives Liquids Medial consonant clusters only Final position Age
I pt mn wj 1;7
II k (n) 1;8.7
III b t 1 nt 1;9
IV s 1;10.7
(f)
V ()* (s) 1;11.7
VI f n 2;0.7
VII (mb) (1) 2;1
s
Notes:
1. ( ) = first appearance.
2. = drops out after two weeks.
3. * = phonemically distinct but phonetically not adultlike.
cases where it was produced as a stop, it typically was voiced. The phoneme /p/
was usually voiced during stage I (although the number of tokens was small)
and was either voiced (28 tokens) or voiceless (20 tokens) during stage II; from
stage III to stage VII, it was rarely voiced. Further evidence of Si's contrast
between /b/ and /p/ can be seen in substitution patterns: /b/ → [b, β, w], rarely
[p] and [ŋ] in bola 'ball'; /p/ → [p, b], once [v], and never [w]. The [ŋ] which
occurred for /b/ in bola was analyzed as having separate phonemic status. The
adult contrast between /t:d/ and /k:g/ was absent. Of the dental pair, words
containing /d/ were very rare, and the /d/ was deleted in four out of the total five
productions of /d/ words. The phonetic variability of /t/ was very similar to that
for /p/: usually voiced in stage I; either voiced (29 tokens) or voiceless (18
tokens) during stage II; and almost always voiceless during stages III-VII ([t],
45 tokens versus [d], 4 tokens). The allophones of |t| were [t, d], and [k] (in gato,
as a result of velar assimilation). Tenedor was consistently produced with an
initial [p] that was analyzed as being phonemically unrelated to |t| (Section 5.1).
The adult phoneme /g/ was as rare as /d/; it occurred only in gato (where Si
treated it as |k|) and in guau guau (where the initial /g/ was phonemically |b| in
Si's system). On the basis of the differences between Si's productions of /b/ and
/p/, it can be argued that Si had a phonemic contrast of voicing in labial stops by
stage III; however, she had not fully mastered the phonetic control of voicing.
Si's dental and velar stops were phonemically and phonetically voiceless. The
Spanish phonemes /bdg/ also have voiced fricative allophones in medial
position; since Si had no voiced velar and dental stops, it would be reasonable to
expect an absence of [ɣ] and [ð], which in fact was the case. In the two words
guau guau 'bow wow' and agua 'water', where [ɣ] could be expected, Si
regularly produced |w| (although the glide was often produced with some velar
friction). For the initial segment of guau guau, Si usually produced [b], which is
the adult initial phone in this word in Spanish baby talk. The fricative allophone
of /b/ began to be produced in stage IV, but it occurred as often in initial position
as in medial position; Si gave no evidence of having the complementary
distribution relationship between /b/ and [β].
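The token counts reported above for /p/ and /t/ can be turned into proportions
to make the shift in voicing easier to compare across periods. The snippet
below is a small worked calculation; the dictionary layout and the function
name voiced_share are assumptions made for illustration, while the counts
themselves are those given in the text.

```python
# Token counts as reported in the text: (voiced, voiceless).
COUNTS = {
    ("/p/", "stage II"): (28, 20),
    ("/t/", "stage II"): (29, 18),
    ("/t/", "stages III-VII"): (4, 45),
}


def voiced_share(voiced: int, voiceless: int) -> float:
    """Proportion of tokens that were produced with voicing."""
    return voiced / (voiced + voiceless)


for (phoneme, period), (voiced, voiceless) in COUNTS.items():
    share = voiced_share(voiced, voiceless)
    print(f"{phoneme} {period}: {share:.0%} voiced")
```

On these figures roughly 58% of /p/ tokens and 62% of /t/ tokens were voiced in
stage II, against about 8% of /t/ tokens in stages III-VII.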
The nasals /m/ and /n/ were phonemic in Si's system from the beginning, and
at least |n| was phonetically stable throughout all stages. The labial nasal,
however, was occasionally denasalized: |m| → [m], occasionally [w] (3 tokens),
regularly [p] in manzana, and [m, w, or w] in Ramón. These latter substitutions
(in manzana and Ramón) were discussed in Section 5.1 as the outcome of
pattern regularization: at stage II, when manzana changed from [ma na] to [pa
na], the word is assumed to have been phonologically reorganized and the initial
segment is analyzed as |p|; in Ramón, two phonological forms are in
competition, |mon| (due to initial syllable deletion) and |bon| (due to medial segment
reduction). The palatal nasal was merged with |n| during stages I to III; during
stage IV, it was distinguished from |n| (|ɲ| → [ɲ, j, n], a pattern of phonetic
variation typical of stages IV to VII). During stage VI, Si used |ŋ|, a sound that is
not phonemic in adult Spanish (Section 5.1).
When fricatives were first being produced (stages IV-V), /f/ and /s/ were
phonetically interchanged in the words manzana and elefante. As with the [ŋ] in
bola, the [p] in tenedor, the [p] in manzana (stage III), and the [w] in Ramón, the
phonemic interpretation of [f] in manzana and [t] in elefante could be handled in
various ways, as for example by positing underlying adult phonemes. However, as
in the other cases, the less abstract solution was adopted: the surface form was
assumed to be the phonemic one (|f| in manzana; |t| in elefante).
All subsequently acquired /f/ and /s/ words were correctly produced, and the
different patterns of variation for the phonemes /f, s, t/ argue for their phonemic
status in Si's system at least by stage VI. Phonetic control over voicing was not a
problem from stage III for |t| and from stage IV for |s| and |f|, stages at which the
adult phonemes were first realized as fricatives in Si's words: |t| was voiceless 29
out of 32 times; |f| was voiceless 20 times (of which 6 occurrences were in
manzana) and was produced as [w] 3 times; |s| was always voiceless ([t] 21
tokens, [] 3 tokens, [s] 21 tokens, [, ] 6 tokens, [ts] once and [h] 4 tokens).
However, during the stages when /f/ and /s/ were realized as stops (primarily
during stages II and III), voicing was variable: /f/ was a voiced labiodental stop 12
out of 18 times, and /s/ was a voiced alveolar stop 23 out of 32 times. Although
|s|, which showed the greatest phonetic variation of the three fricatives, was
frequently [t], the adult phonemes /t/ and /s/ were distinct for Si: /t/ was never
produced as [s]. In spite of the fact that most (14 out of 21) of the instances in
which |s| was produced as [t] occurred in productions of manzana, it is likely that
this was due to the high frequency of manzana among Si's /s/ words, rather than to
any confusion of /s/ and /t/. Words containing the velar fricative /x/ were not
common: /x/ was deleted in both conejo 'rabbit' and reloj 'watch'.
The class of liquids showed the greatest variation during all stages. The
lateral was the rst liquid acquired both phonetically and phonemically. It rst
appeared during stage III in bola ball, but was regularly deleted in three other
words. From stages III to VI, it was produced only twice. During stage VII the
rst stage at which it was produced with any regularity , it was realized as [l, j]
in intervocalic position and as [n, l] in initial and nal position. It was produced
as [d] only in production of reloj where the initial // was produced as a [b] (see
the discussion of pattern force and slips-of-the-tongue in Section 5.1). The //
was completely merged with |l|: it occurred in two words where it was produced
as [n, l]. In mira 'look', however, // was regularly produced as [j] (see Section 4
on the exceptional nature of this word). The // phoneme was similarly merged
with |l| in stages I through IV: // [l, n] (and one time each as [w d, ],
substitutions which may argue for a preliminary contrast between /l/ and //).
During stage V, Si's productions for // changed: // [l, d, d] in intervocalic
position (occasionally as [n]). However, in initial position, // [p] in ratón and
to [b, , l] in reloj. In rana 'frog', the // was either [n] or a labial consonant [m,
w, w]; for two weeks during stage VI, it was a velar nasal (Section 5.1). Clearly
the phonetic variation seen in initial position // does not warrant the assignment
of phonemic status, and it was analyzed as |b|. In this analysis, the [p] in ratón
was due to voicing agreement with the intervocalic |t|. In reloj, the initial [l] phone
occurred only when the medial |l| was produced as [l]; both consonants then
agreed due to lateral assimilation. In rana (a word acquired before the labial +
dental word pattern was established), the predominant initial phone was [n]; the
phonemic status could be either |n| or |b| (if |b|, then instances of the nasal phone
would be due to nasal assimilation). In fact, the earlier form [na na] occurred
during the stages in which // was merged with |l|; that rana may have been
subsequently phonologically reorganized with an initial |b| would explain why
both forms occurred. In medial position, the differences between the phones that
were realizations of // (at least by stage V) and those that were realizations of /l/
demonstrate that Si had at least made a rudimentary two-way phonemic distinc-
tion (in medial position only) within liquids: |l| (/l, r/) versus //.
Medial consonant clusters and final consonants are included in Table 5.4 only
for convenience; they are not considered to be separate phonemes. The cluster /nt/
was the first cluster to be acquired. In final position, /n/ was acquired during stage
II and /s/ during stage V. However, both these final consonants were found in only
one word each. They were not produced regularly until stage VI for /n#/ and stage
VII for /s#/. A third consonant /l/ appeared in final position during stage VII, but
was restricted to only one word and was not consistently produced.
In this section, the acquisition of Si's phonemic contrasts has been described.
The correspondences between adult Spanish phonemes and Si's phonemes can for
the most part be explained by the following general processes: (1) deletion;
(2) phonemic merger; (3) voicing instability; (4) weakening; (5) strengthening;
(6) nasalization; and (7) place instability. In addition, the productions of some
words were lexical exceptions to general rules. However, several of Si's produc-
tions presented formidable problems to a strict phonemic analysis. The unusual
nature of the correspondences between the phones in these productions and the
phonemes in the adult words is apparently related to the ways in which Si's output
was determined by existing word patterns and Si's tendency to combine features
from different segments in the adult words. The explanation for why certain words
were given the same output patterns seems to lie in some general prosodic
similarity of the words involved (a phenomenon perhaps similar to the prosodic
schemas reported in Waterson 1971). The changes over time in output pattern for
some words appeared to be due to changes in Si's hypotheses about which words
were similar and should be said in similar ways. The unusual correspondences
(often the result of feature combinations) and the changes in words incurred as a
result of pattern force made a traditional phonemic analysis of words such as
manzana, elefante, rana, ratón, and tenedor quite difficult and decisions
regarding phonemic structure somewhat arbitrary.
6. Si's development from 2;2 to 2;5
During the period from 2;2 to 2;5, Si's phonological system improved greatly. Her
set of phonemes expanded to include nearly all the phonemes of adult Spanish.
Although each of her phonemes had several variants, her phonetic control was
much improved and the unusual substitution patterns characteristic of the earlier
period largely disappeared. The constraints on the co-occurrence of consonants
that had previously been a function of the word pattern goals also disappeared; the
only exceptions were frozen forms from the earlier period. In terms of syllable
structure, most two-syllable and many three- and four-syllable words were pro-
duced accurately. Those longer words that did not have the correct syllable
structure were reduced by means of a single set of rules: initial syllable deletion;
consonant cluster reduction; and an optional initial and/or final consonant deletion
rule. These rules applied to all words, irrespective of the overall consonant
structure of the word; this systematicity is in sharp contrast to the variability
seen in earlier syllable simplification processes. As in the case of rules governing
segments and the co-occurrence of consonants, the only exceptions to the new
syllable structure rules were words which were frozen forms from the earlier
period.
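The first of the reduction rules just listed can be illustrated with a minimal sketch. It is not the author's formalization: the syllable-list representation, the function name `delete_initial_syllables`, and the `keep` parameter are assumptions made for illustration, and only initial syllable deletion is modeled. The example reproduces the form [sa na] that Section 6.2 reports as the usual production of manzana.

```python
def delete_initial_syllables(syllables: list[str], keep: int = 2) -> list[str]:
    """Drop syllables from the left edge until at most `keep` remain.

    Hypothetical helper: the text states that initial syllables were
    deleted but gives no target length, so `keep` is an assumption.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    return syllables[-keep:] if len(syllables) > keep else list(syllables)

# manzana, syllabified here as man-sa-na (the text treats it as an /s/ word).
print(delete_initial_syllables(["man", "sa", "na"]))  # ['sa', 'na']
```

Words that already fit within `keep` syllables pass through unchanged, in line with the statement that most two-syllable words were produced accurately.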
6.1. Segmental system
By the end of the earlier period, Si's segmental system included phonemes
equivalent to all but four of the adult Spanish consonantal phonemes. In initial
and postconsonantal position, /d/ was merged with |t| and /g/ with |k|. In medial
position, the voiced fricative allophone of /d/ was treated either as |t| or as a
member of Si's liquid class, while the voiced fricative allophone of /g/ was
phonemically |w|. Two other phonemes were absent in Si's system as of 2;1: /x/,
which was deleted, and //, which was merged with |l|. By 2;5, Si's productions
showed some evidence for the contrastive status of each of these four adult
phonemes, although, as with most of her other phonemes, the phonetic realiza-
tion was variable. The allophonic distribution of the voiced stops and voiced
fricatives was not established.
Phonetically, voicing was still not completely mastered. The stop phonemes
|pbtd| occasionally showed voicing errors; |g| was often devoiced, while |k| was
never voiced. Among the fricatives, |f| and |s| were occasionally voiced, and
both exhibited a large amount of manner variation: |f| [f, fw, w, w, pw, b, v,
pf, w]; and |s| [s, , , , h, , t, t, d, d, ). In contrast, both |t| and |x| were
phonetically relatively stable: |t| [t], rarely [t, d]; and |x| [x, h, ] and []
only in jugo 'juice'. In the English words shoes and home, Si used [t] and [h]
respectively. The nasals were phonetically very accurate, although the palatal
nasal continued to be produced as a nonnasal glide on occasion. The liquids
were by far the least stable phonetically, and in the case of the two r phonemes
bore little resemblance to the adult articulations: |l| [l, n, ], rarely [d]; ||
[l, j, d, , ]; and || [, l, n, d]. The fricative [], the intervocalic allophone of
/d/, was usually produced as [d], but occasionally as [l, n]; thus, its set of phones
was identical to the set of phones for |l|. Since initial-position |d| was never
produced as [l] and the relative proportions of [d] and [l] phones for [] and |l|
differed significantly, Si at least contrasted /d/ and /l/ and, in most cases,
correctly treated [] as phonemically related to /d/. The infrequent cases of
[] [l] appear to be the last traces of her earlier indecision regarding the
phonemic status of adult [] (1;7 to 2;1).
The most significant aspect of Si's phonemic system during 2;2 to 2;5 is that all
the unusual substitutions that characterized the earlier time period disappeared: no
new words were produced in ways consistent with the earlier pattern-dominated
substitution rules, and most old words were produced in a manner consistent with
the above description of phonemes and allophones. Only five words persisted in
the earlier word pattern form: vestido, |#b| |p|; la niña/o, /#l/ |n| (see note 4);
guau guau, /#g/ |b|; Fernando, /#f/ |t|; and tenedor [b d].
6.2. Co-occurrence of consonants
During the period 1;7 to 2;1, the co-occurrence and ordering of consonants in a
word were determined by word patterns, primarily the patterns labial + dental
and velar + dental. In a small number of words, Si used metathesis as a way to
achieve the preferred ordering of consonants. During 2;2 to 2;5, all sequences of
consonants occurred, although there were still a few words in which the order
was dental + labial (in the adult model). In only one word did Si metathesize the
consonants: this word was gato 'cat', which already had appeared as [d/ta ko] in
the last stage of the earlier period and which remained in this form throughout
the study. All other words preserved the adult ordering of consonants.
In initial and medial positions, singleton consonants were produced accord-
ing to the description of Si's phonemes and allophones in the preceding section.
In final position, /n/ and /s/ were well established (although /s/ was occasionally
deleted), /l/ was usually deleted but occasionally produced correctly, and some
allophone of final /r/ ([, d], rarely [n]) was usually produced; Si used no words
containing a final /d/. In addition, Si usually correctly pronounced /m#/ in the
English word home; in a few tokens the /m/ was deleted.
In clusters, /s/ and all liquids were deleted, with the exception of one production
of éste 'this' [es ti] (2;5), and two productions of manzana 'apple' [m sa na]
(manzana was usually produced as [sa na]). The nasal + stop clusters /mb/, /nt/,
/nd/, and /nk/ ([k]) were well established; if one member of the cluster was
deleted, it typically was the nasal in nasal + voiceless stop clusters and the stop in
nasal + voiced stop clusters (see also Ferguson 1977b). The cluster /ng/ ([g]) was
nearly always produced as [k], rarely as [g], once as [], and once as [k]. Si used
no words with the cluster /mp/. The rules governing consonant occurrence in
clusters and final position were similar in the earlier and later periods.
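The cluster patterns described above can be summarized in a short sketch. This is an illustrative simplification, not the author's rule formalism: the ASCII symbol sets, the function name `reduce_cluster`, and the flag `reduce_nasal_stop` are assumptions, the velar nasal is omitted, and the exceptions noted in the text are not modeled.

```python
VOICELESS_STOPS = {"p", "t", "k"}
VOICED_STOPS = {"b", "d", "g"}
NASALS = {"m", "n"}
DELETED_IN_CLUSTERS = {"s", "l", "r"}  # /s/ and the liquids

def reduce_cluster(cluster: str, reduce_nasal_stop: bool = False) -> str:
    """Return a child-like form of a two-consonant cluster (hypothetical helper)."""
    if len(cluster) != 2:
        raise ValueError("expected a two-consonant cluster")
    first, second = cluster
    # /s/ and liquids were deleted in clusters.
    kept = [c for c in cluster if c not in DELETED_IN_CLUSTERS]
    if len(kept) < 2:
        return "".join(kept)
    # Nasal + stop clusters were usually intact. When one member was lost,
    # it was typically the nasal before a voiceless stop and the stop
    # after the nasal when the stop was voiced.
    if reduce_nasal_stop and first in NASALS:
        if second in VOICELESS_STOPS:
            return second
        if second in VOICED_STOPS:
            return first
    return cluster

for cluster, flag in [("st", False), ("nt", False), ("nt", True), ("nd", True)]:
    print(cluster, flag, "->", reduce_cluster(cluster, flag))
```

The flag is kept separate because the text describes nasal + stop clusters as well established, with deletion only an occasional outcome.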
6.3. Syllable structure
As was seen in Section 5, the variable nature of syllable deletion rules provided
some of the clearest evidence for the existence of favored word patterns during
the early period of 1;7 to 2;1 (Table 5.2). In the later period, however, Si had two
regular rules for reducing three- and four-syllable words to a shorter form:
delete the initial syllable(s), or delete the initial or final consonants. Of the
approximately seventy words that she acquired between 2;2 and 2;5, only one
was reduced in a manner consistent with the earlier word pattern stage: pantalón
'trousers' [pa lon] or [bum po lon]. In addition, many words were produced
with the correct number of syllables. Only one word persisted in the form from
the earlier word pattern stage: tenedor 'fork' [b l d]; this word was
produced frequently and always in the same labial + dental form.
To summarize, Si's productions from 2;2 to 2;5 were largely accurate realiza-
tions of the complex consonant and syllable structure of the words she knew.
Little evidence remained of the earlier stage where the realization of consonants
and syllables was constrained by a small number of word patterns. It is worth
noting that had Si only been seen beginning when she was 2;2 (or as early as
1;11), the phonological forms of words such as tenedor, vestido, reloj, or
pantalón would have appeared anomalous; it is only within the context of Si's
earliest development that such words can be seen as an integral part of her
developing phonological system.
7. Discussion
In the past ten or so years, interest in child phonology has greatly increased
among linguists, mainly as a result of the inclusion of child phonology data in
the class of behavior for which a model of phonology must provide explan-
ations. As a result, many linguistic studies have recently appeared that docu-
ment the acquisition process for individual children and thus have contributed to
the description of phonology acquisition in general. Since phonological acquis-
ition has not been studied sufficiently to separate developmental phenomena
reliably from those phenomena that reveal universal properties of phonological
structure, it is premature to make strong claims concerning the contributions of
phonology acquisition to general phonological theory. It is, however, appropri-
ate to ask what the study of an individual child (in this case, Si) may contribute
to what we know or hypothesize to be true of the acquisition of phonology.
Toward this goal, aspects of Si's data will be reviewed in this section as they
pertain to a general model of phonology acquisition (in particular the relevant
units of acquisition), universals of acquisition, and individual differences. It
is, of course, obvious that the validity of the following interpretation can only be
substantiated (or invalidated) on the basis of additional studies of many more
children acquiring a variety of languages.
Several recent papers have argued that acquiring phonology is basically a
cognitive process, i.e., a problem-solving task (see in particular Kiparsky and
Menn 1977). At several points in the preceding sections, data were interpreted
as being explained best by such a model. Probably the most dramatic evidence
in these data of the child as the active organizer of phonology is Si's
phonemicization of the velar nasal, a phonetic segment that occurs in highly
restricted environments in adult Spanish but has no phonemic status. The theme
of the child's active role in acquisition will also figure prominently in the
following discussion of the units of acquisition. However, it will be in the
evaluation of individual differences that the problem-solving model will be
most useful.
Of central importance for any model of child phonology is the character-
ization of the units of acquisition. During the earliest stages (from 1;7 to 2;1),
Si's data showed several phenomena which could best be accounted for by
assuming a central role for the word as a basic unit being acquired. More
specifically, the data argue for several levels of representation in her phonolog-
ical system. These levels correspond to two and possibly three basic units: the
word, the phoneme (i.e., some segmental unit larger than the feature), and
possibly the feature.
By 'word' is meant a grammatical unit: in these data the word is the
morpheme and the morpheme is the word. A grammatical unit was needed to
describe the constraints on sequences of smaller units (which are best charac-
terized as phonemes) and to specify the domain within which phonological
processes operated (e.g., the simplification process of consonant harmony). In
addition, word boundaries were needed to block processes (with the exception
of some early sentences which were characterized by harmony across word
boundaries, Section 4) and to condition processes (e.g., word-initial versus
word-final (but not necessarily syllable-final) phenomena). Some such gram-
matical unit is frequently used in just these ways in descriptions of adult
phonologies.
However, the data support an even stronger claim regarding the word,
namely that the word and associated word-structure constraints are psycholog-
ically real. The evidence for this claim is that Si formulated hypotheses about
the nature of Spanish phonology on the basis of the similarity between words,
that she abstracted what were referred to here as word patterns, that she
expanded and generalized these word patterns to handle new words and that
she changed the output form of some old words as she learned new word
patterns (i.e., words that were similar in some ways underwent rules together).
The claim is then that the word was a basic organizational unit of her (early)
phonological development and that without considering the word as a unit,
some phonological phenomena would have appeared quite arbitrary and the
frequency and consistency of other phonological phenomena could not have
been easily explained.
Although the learning of words and word patterns, phonemes and features was
occurring simultaneously throughout this period of 1;7 to 2;5, the evidence for the
centrality of words and word patterns had largely disappeared by 2;1. In fact, the
data from 2;2 to 2;5 could be adequately described in terms of phonemic
contrasts, allophonic relationships, and phonotactic constraints. This change
suggests that the levels of representation constitute a hierarchy (at least
developmentally) in which words and word patterns dominate phonemes, and
phonemes in turn are more central than features. That the phoneme is a basic
building block of the acquisition process can be seen in Section 6; the evidence for
the feature is considerably weaker. Although it is clear that Si could analyze
phonemes into component features (as seen for example in the way in which she
recombined features from different segments of the adult word), there is little
evidence that once having mastered particular features, she could generalize this
knowledge to the acquisition of a new phoneme. For example, the acquisition
of the features voicelessness and frication (|f, s|) and velarity (|k|) were not
sufficient for the acquisition of /x/ (cf. also Ferguson 1977b). The strongest
evidence for the status of the feature as a unit is the overgeneralization of the +
velar feature in initial position (word pattern) which resulted in the phonemiciza-
tion of an initial velar nasal, a process that (if accurately stated) clearly involved
the analysis and generalization of features.
Such a developmental hierarchy would be consistent with the even stronger
claims regarding the word made by researchers who have studied children
somewhat younger than Si (Cruttenden 1971; Ferguson and Farwell 1975;
Menn 1971, 1977; see also Ferguson 1977a). The change in the hierarchy
seen at roughly 2;2 (but probably occurring earlier) would be consistent with
the Piagetian claim that at each successive period of development in the
acquisition of complex systems, previous skills are reorganized into a new
system of knowledge (Piaget 1952; see also Bower 1974 for a similar view of
motor skills). This knowledge of word-structure constraints is not lost, and in
fact must be substantially elaborated, for it is part of an adult speaker's knowl-
edge of his/her language (see Greenberg and Jenkins 1964).
Clearly, a model of phonological acquisition should also describe the uni-
versal aspects of acquisition. In Sections 3 through 6, aspects of Si's develop-
ment that correlated with putative universals as claimed by various researchers
in child phonology were noted. Here, some of those aspects will be reviewed, in
order to place them within the context of phonology acquisition in general.
One of the most significant aspects of child phonology is the simplification of
adult words to one or two syllables, typically composed of alternating singleton
consonants and vowels. Constraints on the length of words, and on the com-
plexity of co-occurrence of sounds in words, are probably universal. Smith's
four functions of child rules (1973) describe the major types of simplification
(see also Ingram 1974b), and these are seen in Si's data as well (Section 5).
Of the theories that deal with the universal order in which consonants are
acquired, Jakobson 1941/1968 is the most explicit. Although there are problems
with this theory (see Ferguson and Garnica 1975), the general claims regarding
the order of acquisition of classes of consonants fit the general pattern seen in
Si's data and in several other studies of Spanish acquisition (Macken 1975,
1978). Si acquired the classes of stops and nasals before the class of fricatives,
and liquids were acquired last. Front consonants were acquired before back
ones, and voiceless before voiced. /l/ was the first liquid acquired, with the
contrast between the r-phonemes acquired considerably later. However, /t/ was
acquired before the first true fricative, contrary to Jakobson's prediction (see
also Macken 1975, 1978). Before being produced correctly, the fricatives were
replaced by the homorganic voiceless stops, back consonants by front ones, and
the r-phonemes usually by |l|, a set of substitutions predicted by Jakobson.
When compared to data from other children acquiring Spanish, Si's data are
identical in at least the following additional ways: /n/ was acquired in final
position before /s/, with both before final /l/ or /r/; learning the allophonic
relationship of stop-spirant for the voiced stop phonemes proved difficult and
the earliest stage was one in which the voiced stops were usually produced as
glides and [] in some ways patterned with the liquids (see also Stoel 1974);
many aspects of the co-occurrence of consonants in words were determined by a
strength hierarchy in which voiceless stops (and nasals in the earliest stages) are
stronger than voiceless fricatives (Macken 1975, 1978). In addition, a wide
range of phonetic variation (Section 3) and the exceptional status of some
lexical items (e.g., Si's mira and guau guau, Section 4, and the frozen
forms in Section 6) were seen in all the Spanish subjects and are probably
universal. (From even so brief a summary, it is clear that it is at the level of
segment inventories that Si is most similar to other children acquiring Spanish
and other languages.)
Some aspects of Si's development appear to be characteristic only of Si
among the six subjects studied; other aspects are unique only in the frequency
with which they occur in her corpus. For example, all subjects imitated to some
degree at some point: Si imitated more than any other subject; J, another subject,
imitated rarely until the age of 2;1, when he began to imitate a great deal. Si
coalesced several learned routines into a single unit (e.g., qué es? 'what is'
|kes| and unit phrases like mommy home); most subjects did this only with
donde está? 'where is' (which was produced, as in Spanish baby-talk, as |n ta|)
and one subject never used such forms. Si was the only subject who produced
many misperceptions (Section 3).
As was pointed out before (Section 4), there are certain parallels between Si's
general language behavior and aspects of her phonological development, as for
example between the coalesced routines and the coalesced word patterns. The
misperceptions suggest a global, only partly differentiated auditory processing
which is paralleled by her loose, prosodic treatment of words. The pattern which
emerges is consistent in several ways if we view Si as a child whose preferred
processing mode is a global one rather than a detail or analytic one (see note 5).
Within this context of an information-processing model for interpreting individual
differences (see in particular Zelnicker and Jeffrey 1976), the difference between Si
and a child like J as a global- versus a detail-processing child relates several
aspects of their language behavior with specific differences between their
phonological development (see Macken 1976). This view would predict that
the tendencies to scan an entire three- or four-syllable word for pattern-criterial
consonants, to use several different syllable reduction rules, and to have
substitution and metathesis rules sensitive to word position and pattern goals
would occur together and be of significantly greater frequency in the phonology
of global children such as Si than in the phonology of nonglobal children like
J. This analysis of differences in terms of cognitive styles is not an unreasonable
extension of a cognitive model of phonology, one emphasizing the problem-
solving nature of acquisition, and is promising in that it may, if successful,
restrict the range of individual phonological differences to several sets or
syndromes typical of different styles.
In addition to the possible set of restrictions on individual differences
deduced from a set of basic preferences in processing styles, it appears that
the particular structure of the language being learned may also restrict the
number and type of individual differences among children learning that lan-
guage. Although Si is the only subject in the study to have such a strong
preference for word-initial labial (and subsequently word-initial labial or
velar) word patterns (see note 6), it may be that in part this particular preference reflects a
particular property of Spanish: Hooper (1976) suggests that in Spanish labial
and velar consonants are considerably stronger than dental consonants in
syllable initial position. It seems reasonable to expect that the different ways
in which children organize their phonology will reflect differential selection and
emphasis of particular aspects from the complete possible set of complex
relationships that obtain in the particular language being acquired (Macken
1976).
Notes
1. Although we do not know how Si's parents pronounce this phrase, it is likely that
adults (in fast speech) also delete one of the two vowels; however, the other children
used [ke] when requesting information in comparable situations. La niña is another
example of a unit phrase (see footnote 4).
2. Although Si is chronologically older than the young children reported on by
Ferguson and Farwell and Menn, she is similar to these children in size of vocabulary.
Ferguson and Farwell use the first fifty words as the time domain for the phenomenon
they describe. Si was using a vocabulary of approximately fifty-two words up to the
age of 1;10.15 and had only ninety-seven words by the age of 2;2 (n.b. as determined
by her productions during our experimental sessions). With respect to age and
vocabulary growth, she is similar to the other children in our study.
3. The gloss for this production could be libro 'book', in which case Si metathesized the
consonants to achieve the preferred output (see also sopa 'soup'). This was one of
several cases in which an examination of Si's imitated corpus was of no help in
solving a problem occurring in the spontaneous corpus: Si's imitations of libro were
always closer to the adult form than her spontaneous production. It was clear from the
context that when she produced the form given in the table, she was referring to a
book, but whether the origin of her form was libro or librito could not be determined.
4. At about 1;9, Si began producing niña 'girl' as na ni na (later, the palatal nasal was
correctly produced); the adult form of la niña 'the girl' was evidently lexicalized as a
unit. Niño 'boy' was also produced with an initial extra syllable na. Since Si often
neutralized final vowels, it was frequently not clear what gender was intended; she
would, however, produce a final schwa in imitation of either niña or niño. This word
|na ni na/o| was very common and was regularly produced with three syllables. No
other word was lexicalized with the article into one unit in this way.
5. Peters (1977) reports on the language of a somewhat unintelligible child who used
two types of speech, analytic and gestalt. Peters suggests that these two types of
speech may reflect two language-learning strategies that may be used to different
degrees by different children. This contrast of analytic versus gestalt is similar in
some respects to the detail versus global cognitive style dichotomy of Zelnicker
and Jeffrey (1976).
6. Montez Giraldo (1970:488) reports that 'Cuando en la palabra hay una labial, es
frecuente que se produzca una metátesis que tiene como resultado iniciar la palabra
con la labial' (when there is a labial in the word, a metathesis frequently occurs
that results in the word beginning with the labial). However, six out of the eight
examples that the author gives are from only one child (Emilia, 20-27 months) of
the four that he studied. This child (see also
Montez Giraldo 1971) metathesized both dental + labial and velar + labial sequences
but apparently no others (e.g., zapato patato; camisa manika, 1971: 339).
References
Alarcos Llorach, E. (1950). Fonología española (3rd edn. 1961). Madrid: Editorial
Gredos.
Bloom, L., Hood, L., and Lightbown, P. (1975). Imitation in language development: if,
when and why. Cognitive Psychology 6, 380-420.
Bower, T. C. R. (1974). Repetition in human development. Merrill-Palmer Quarterly,
20, 3038.
Bush, C. N., Edwards, M. L., Luckau, J. M., Stoel, C. M., Macken, M. A., and
Petersen, J. D. (1973). On specifying a system for transcribing consonants in
child language: a working paper with examples from American English and
Mexican Spanish. Stanford: Stanford Univ., Dept. of Linguistics.
Cruttenden, A. (1970). A phonetic study of babbling. British Journal of Disorders of
Dalbour, J. B. (1969). Spanish pronunciation: theory and practice. New York: Holt,
Rinehart & Winston.
Farwell, C. B. (1976). Some strategies in the early production of fricatives. Papers and
Ferguson, C. A. (1977a). Learning to pronounce: the earliest stages of phonological
development in the child. In F. D. Minifie and L. I. Lloyd (eds.), Communicative
competence and cognitive abilities early behavioral assessment. Baltimore:
University Park Press.
(1977b). New directions in phonological theory: language acquisition and universals
research. In R. W. Cole (ed.), Current issues in linguistic theory. Bloomington:
Indiana University Press.
Ferguson, C. A. and Farwell, C. B. (1975). Words and sounds in early language
acquisition. Language, 51(2), 419-39. Reprinted in this volume as Chapter 4.
Ferguson, C. A. and Garnica, O. K. (1975). Theories of phonological development.
In E. H. and E. Lenneberg (eds.), Foundations of language development, vol. 2.
Ferguson, C. A., Peizer, D. B., and Weeks, T. E. (1973). Model-and-replica phonological
grammar of a child's first words. Lingua 31, 35-65.
Fromkin, V. A. (ed.) (1973). Speech errors as linguistic evidence. The Hague: Mouton.
Greenberg, J. H. and Jenkins, J. J. (1964). Studies in the psychological correlates of the
sound system of American English. Word, 20(2), 157-77.
Hooper, J. (1976). An introduction to natural generative phonology. New York:
Academic Press.
Ingram, D. (1974a). Fronting in child phonology. Journal of Child Language, 1, 233-41.
(1974b). Phonological rules in young children. Journal of Child Language, 1, 49-64.
A. R. Keiler. The Hague: Mouton. (Originally published as Kindersprache, Aphasie
und allgemeine Lautgesetze. Uppsala: Almqvist & Wiksell.)
Kiparsky, P. and Menn, L. (1977). On the acquisition of phonology. In J. MacNamara
(ed.), Language learning and thought. New York: Academic Press.
Labov, W., Yaeger, M., and Steiner, R. (1972). A quantitative study of sound change in
progress, vol. I. Philadelphia: US Regional Survey.
Macken, M. A. (1975). The acquisition of intervocalic consonants in Mexican Spanish: a
cross-sectional study based on imitation data. Papers and Reports on Child
Language Development, 9, 29-42.
(1976). Individual differences in phonological acquisition: strategy versus cognitive
style. Paper presented to the Child Language Seminar Series, Stanford University,
May 1976.
(1978). Permitted complexity in phonological development: one child's acquisition of
Spanish consonants. Lingua, 44, 219-53.
(1974). A theoretical framework for child phonology. Paper given at the Summer
Meeting of the Linguistic Society of America, Amherst, MA.
(1976). Evidence for an interactionist-discovery theory of child phonology. Papers
and Reports on Child Language Development, 12, 169-77.
(1977). Phonological units in beginning speech. In A. Bell and J. B. Hooper (eds.),
Syllable and segments, pp. 157-71. Amsterdam: North-Holland.
Montez Giraldo, J. J. (1970). Dominancia de las labiales en el sistema fonológico del
habla infantil. Thesaurus, 25(3), 487-8.
(1971). Acerca de la apropriación por el niño del sistema fonológico español.
Thesaurus, 26, 322-46.
Moskowitz, A. I. (1971). The two-year-old stage in the acquisition of English phonology.
Language, 46(2), 426-41.
(1973). The acquisition of phonology and syntax: a preliminary study. In
K. J. J. Hintakka, J. M. E. Moravcsik, and P. Suppes (eds.), Approaches to natural
language. Dordrecht: Reidel.
Peters, A. (1977). Language learning strategies: does the whole equal the sum of the
parts? Language, 53, 560–73.
Piaget, J. (1952). The origins of intelligence in children. New York: International
University Press.
Smith, N. V. (1973). The acquisition of phonology, a case study. London: Cambridge
University Press.
(1975). Universal tendencies in the child's acquisition of phonology. In N. O'Connor
(ed.), Language, cognitive deficiencies and retardation. London: Butterworths.
Stockwell, R. P. and Bowen, J. D. (1965). The sounds of English and Spanish. University
of Chicago Press.
Stoel, C. M. (1974). The acquisition of liquids in Spanish. Unpublished PhD dissertation,
Stanford University.
Vihman, M. M. (1978). Consonant harmony: its scope and function in child language. In
J. H. Greenberg, C. A. Ferguson, and E. A. Moravcsik (eds.), Universals of human
language. Stanford University Press.
Waterson, N. (1971). Child phonology: a prosodic view. Journal of Linguistics. 7,
Zelnicker, T. and Jeffrey, W. E. (1976). Reflective and impulsive children: strategies of
information processing underlying differences in problem solving. Society for
Research on Child Language Development Monograph 168.
6 Development of articulatory, phonetic,
and phonological capabilities
Lise Menn
I. Introduction
A. Background
A rather large body of information about the early stages of the acquisition of
the phonology of English and some other languages (Spanish, Mandarin,
Thai) has become available over the last decade or so, and the theory of
acquisition of phonology has not only grown, but has changed its nature
considerably. Since about 1974, we have moved away from a model in
which phonological development was considered to resemble the differentia-
tion of an embryo. In its place we have evolved a notion of the young child as a
creature of some intelligence who is trying to solve a problem: the problem of
sounding like her companions when communicating with them. This shift of
model took place as more diary and small-group studies were published, and
in the context of Slobin's similar approach to the acquisition of morphology
and syntax (Slobin 1966, 1973).
In recent years, the study of child phonology has also become distinctly more
psychological in the explanatory concepts that it employs. This is largely
because the richer data base has made it possible to see a considerable range
of individual differences among children. Faced with such diversity, we have
had to look below the surface for an underlying unity; and in doing so, we have
begun to invoke notions of processing and storage of information in addition to
the linguistic notions of articulatory control and phonemic contrast.
In this chapter, I will review the strategies that children are presently believed
to use in acquiring phonology, and I will give an account of the psycholinguistic
model of early phonology which I think is presently the most adequate. For
more extensive discussion, in addition to the references which will be cited, the
reader should see the many important papers collected in Yeni-Komshian,
Kavanagh, and Ferguson (1980).
My grateful thanks to Sarah Hawkins, Paula Menyuk, and Ronnie Wilbur, who spent considerable
time and effort working over the first draft of this chapter. At the urging of Charles A. Ferguson, and
with the help of Prof. Hawkins, I have attempted to overcome old habits and use IPA consistently
throughout it.
B. Plan of exposition
We will begin this summary of the acquisition of phonology by looking at the
transition from babble to speech in Section II. This is necessary so that we can
understand the problems of defining which early vocalizations a theory of child
phonology attempts to account for.
Then we shall undertake the construction of a model of child phonology that
will allow us to deal separately with three different kinds of information that the
child is acquiring: (1) knowledge of how words sound, (2) knowledge of how to
pronounce them, and (3) knowledge of allomorphy or abstract phonology,
manifested as the relationships among words or morphs that sound somewhat
different but are the same in meaning.
In Section III.A, we shall discuss the child's perceptual knowledge of the
sounds of language; in Section III.B, we will pause to discuss problems with the
notion of phoneme in early stages of phonological development. In Section
III.C, we will turn to the traditional subject matter of child phonology: the ways
in which children pronounce the words of the adult language (Section III.C.2).
We will see, however, that modifying the pronunciation of a word is only one
type of reaction to the complexities of adult phonotactics; the other type of
reaction is the avoidance of particular sound patterns (Section III.C.1). These
behaviors can be unified, at least in the early stages of acquisition, by the formal
descriptive device of saying that children obey phonological output constraints
(Section III.C.3), and data from a variety of children are presented which
support this description.
In Section IV, we consider how a child may go about inventing rules to derive
her output forms from the forms spoken by adults; Section IV.A extends the idea
of "ease of articulation" by including "skill already acquired at a point in time"
as a factor affecting the ease of a new sound. In Section IV.B, the notion of
naturalness in child phonology is discussed, and we consider how it can be
related to our account of rule-creation. In Section IV.C, we note the non-natural
rules that children can also create, including those which appear to stem from a
dim awareness of the fact that allomorphy exists in the adult language. Section
IV.D describes data on rule origin and growth, and Section IV.E concerns rule
overgeneralization.
Section V presents the notion that early phonological development should be
viewed as the development of skill in the ability to program and execute
complex motor sequences. It begins by noting the theoretical importance of
the two irregular phenomena that are most difficult for conventional approaches
to deal with: overgeneralizations of rules of child phonology (Section V.A) and
phonological idioms (Section V.B). Then the regular pattern of the arrangement
of children's early words into families of canonical forms is recalled in Section
V.C. Section V.D attempts to account for these three fundamental types of data
within a unified two-stage model of articulatory motor programming.
In Section V.E, we see how this model can be fitted into the overall picture of
child phonology that was set up in Section II.A; we flex its explanatory muscles
in dealing with rules and rule changes, and we consider some of its conceptual
limitations in subsection V.E.3, entitled "Caution: The limitations of the
programming metaphor."
In Section VI, we deal with some difficult logical and methodological topics:
the relation of imitated to spontaneous productions and the nature of
children's metalinguistic ability to focus on pronunciation as a task (Section VI.A).
These are related to the perennial problem of why some sounds may appear in
babble but not in speech (Section VI.B). (Section VI may be considered
logically prior to most of the rest of the chapter.)
Section VII, "The acquisition of allophones and allomorphs," turns to the
other major branch of developmental phonology, and gives a brief outline of this
topic, especially as it relates to questions of psychological reality. The reader is
referred to MacWhinney's (1978) monograph on this topic, since including a
full account of it would double the length of the chapter.
Finally, Section VIII lists the major findings of the past decade of research in
developmental phonology, recalls the motor programming model for the
beginning stages of acquisition of phonology which was proposed in Section V, and
briefly contrasts the working assumptions of the current approach with those of
the preceding Jakobsonian era.
Note: Some longitudinal studies will be cited repeatedly in this chapter. For
convenience, unless otherwise noted, references to Hildegard are from
Moskowitz (1970b) (originally, of course, from Leopold 1939); to Daniel
from Menn (1971); to Si from Macken (1979); to Jacob from Menn
(1976a, b); and to Amahl from Smith (1973).
II. The transition from babbling to speech
We can usually assume that phonology deals with sound patterns of words, but
even in adult languages we must decide whether the phonology that we write
should attempt to include certain marginal items, for example, onomatopoeic
representations of animal cries and noises. In studying early child phonology,
this problem, marginal in dealing with most adult languages, becomes central.
There is no ready-made solution to it; in this section, I will just attempt to show
the nature of the difficulty.
There seem to be three definable types of utterances found during the
transition period that we call the onset of speech: sound-play, proto-words,
and modulated babble. Modulated babble refers to the use of strings of sounds
which appear to carry meaning only by their intonation contour.
This is also called "jargon," and it can be very eloquent and effective vocal
communication. Since our concern is with the development of articulation, we
will not discuss modulated babble further here; the reader is referred to von
Raffler-Engel (1973) and Menn (1976a, b).
Sound-play, which may include word-practice, is not communicative behavior;
in other words, when we classify an utterance as sound-play, we do so because
there is no indication of any association between recurrent context and recurrent
sound-play patterns. One can of course say that sound-play is expressive of a
cheerful mood, but in that weak sense, any evidence of mood is communicative.
Joint sound-play is another matter; it is certainly communicative action, but it
seems to be absent or rare in adult–child pairs in our culture when the child appears
mature enough to be on the threshold of speech, although it is certainly found with
young infants (see Sterne, Jaffe, Beebe, and Bennett 1975; Snow 1977).
Proto-words are articulated meaningful utterances; some of them are directed to
others (one can tell because the child gets annoyed if no one responds), and some
are solo performances. These are our objects of study, for only here can we be
certain that the child is trying to say a word, that is, trying to match a desired
perceived target. And again, we judge that they are meaningful because of a
recurrent association between sound and situation (although obviously if what
appears to be a clear token of an adult word is uttered just once in a context for
which it is strikingly appropriate, it is usually included as a meaningful utterance).
A child may have all of these utterance types for a period of several months.
Some utterances, furthermore, may contain elements that belong to more than
one class: for example, a child may start playing sound-games with a real
word (Weir 1962; Menn 1976a), or he may address one with an utterance that
has a real word or two embedded in modulated babble (Jones 1967). And of
course, in practice, some utterances are hard to classify, since classification
depends in part on surmising the child's intent.
The important point here is that clear cases can easily be found, and that a
child may have one, two, or all three of these utterance types for a period of
many months. The silent period, despite the emphasis given to it in the older
literature, is a rare phenomenon.
There is a fourth type of utterance that we should mention. Some children's
early attack on language proceeds by global approximation to long phrases
rather than by attempts at single words or short phrases. Their early efforts at
speech are characterized by variable and often loose articulation which is
extremely hard to transcribe; Ann Peters (1977) dubbed these children
"mush-mouth kids." In this chapter, we shall consider only children who take the more
segmental word-by-word approach to phonology; the reader who is interested in
the global approach should see the Peters article and also Branigan (1979).
Proto-words now need to be defined more carefully. They are vocables
(articulated utterances) which recur in definable contexts. One might fear that
this notion of recurrent definable contexts would be very difficult to use, but it
generally is not, because a one-year-old's activities tend to fall into identifiable
behavioral routines, some solo and some partnered. These include favourite
manipulations on objects (putting things into things), games (peekaboo),
directing an adult's attention (pointing), obtaining things (requesting/demanding),
offering things, greetings, farewells, and so on. Halliday (1975) describes such
pairs of vocalization and behavioral routine in elegant detail; see also Menn
(1976a) and Clumeck (1977). The meaning of a proto-word is originally very
limited, and is best characterized as "what you say when you do X."
Proto-words may thus usefully be considered as one type of vocal signal; they are not
yet symbols, because each of them is bound to the performance of its routine; it
cannot be used freely in new contexts. At some point, however, first singly and
then more rapidly, some of the proto-words start to be used in more situations,
and thus they begin to acquire the symbolic autonomy of the true word. For
example, a woof-woof vocable may be initially used only when a child is
pointing to a picture of a dog; then it may be generalized rapidly to pointing to
real and toy dogs, and yet it may take months to become usable in requesting a
toy dog. Incidentally, proto-words do not have to have adult words as models
(Halliday 1975), and some without adult models may even make the transition
to becoming true symbols (Menn 1976a, Menn and Haselkorn 1977).
Proto-words are, by definition, the first units for which a child is trying to
produce a particular articulated sound pattern for communication (always
excepting the whole-phrase efforts of the "mush-mouth kids"). If we wish to
make generalizations about the child's first phones, or to evaluate the
applicability of terms such as "phoneme" to the onset of speech, we must look into the
period when proto-words are first being produced. Sometimes what we see is a
handful of nicely defined CV(CV) shapes, as tradition would have it: [papa],
[mama], [dada]. A good example is given in Ferguson, Weeks, and Peizer
(1973). But more often, apparently, the early picture shows quite a mixture of
forms: some vowelless items, perhaps, such as [m::], or Hildegard's [::]; some
traditional CV(CV) shapes and/or some (C)VC and VCV shapes; perhaps an
isolated word with a consonant cluster (Hildegard, again); and some wildly
fluctuating forms that seem to originate from rather complex adult target words
(e.g., Jacob's renditions of thank you, which showed an endless variation
including [deig], [geigu], [gigo], [g:do], [dejo], [dido], [dt], [it]).
Summarizing this section: the transition from babbling to speech is typically
gradual, and may involve any combination of four types of utterances: sound-
play, modulated babble (using meaningful or possibly meaningful intonation
contour), whole-phrase efforts, and proto-words. Proto-words are meaningful
utterances with phonetically definable targets; however, the phonetic definition
may be quite loose by the standards of adult phonetic target-matching and the
meaning may be very limited and situation-bound. We will take child
phonology as beginning with proto-words, and in Section III.B we will examine the
problem of applying adult-based phonological concepts to these first words.
III. Constructing a model of early phonological knowledge
In this section, we will undertake the description of some aspects of early
phonological knowledge. This includes what children, in the first months of
speaking, seem to know about the sounds of words in adult language
(perceptual knowledge), about the relations among those sounds (phonological
knowledge, including knowledge of segmentation and phonemic contrasts),
and about how to pronounce words.
The most striking fact about early child words has always been how
simplified most of them are compared to their adult models. What has made child
phonology an object for study has been three realizations about these
simplified forms: that there are generally systematic relations among a given child's
words, that there are generally systematic relations between the child's word
and the adult model word, and that it is possible, by comparing children who
have very different ways of dealing with adult words, to come up with a general
theory of why and how these generally systematic relations exist. These three
realizations will be developed in this section and in the two which follow.
Note: Beginning in this section, I will occasionally draw small flow-chart
diagrams in order to keep track of the various capacities for processing and
storage that we postulate in order to account for the child's language behavior. It
is important to keep in mind that the entities and processes represented by these
boxes and arrows are only hypothetical constructs, and that even the best
guesses among them must be grossly oversimplified compared to whatever it
is that we have in our heads.
III.A. The input lexicon: representation of the adult word
"Lexicon" is a word whose precise meaning varies from user to user, but it at
least denotes a collection of stored, accessible, memorized bits of information
about the sounds and meanings of words and/or their component meaningful
parts. We must grant that something which should be called a lexicon exists in
the human individual; that is, there must be some form of long-term storage
containing at least a sketchy encoding of the sound pattern and meaning which
is accessible when we recognize and understand a word.
In order to say a word spontaneously and meaningfully, one must also have
access to stored information about how it sounds and what it means; a standing
controversy is whether this knowledge is best represented by postulating a
separate output lexicon or whether both recognition and production informa-
tion are best conceived of as being in a single lexicon (Butterworth 1983: chs. 6
and 7).
To advocate a single lexicon in a psycholinguistic model of child
phonology is to hypothesize that the rules which create the child's output form from
her input form operate in real time; to advocate a two-lexicon model is to
claim that a form closer to the output form is also stored and that this second
form is used as a basis for production. Much of the data that we will consider
can be handled more gracefully in a two-lexicon model than in a one-lexicon
model; I think the two-lexicon model is likely to be a better approximation to
what we really utilize in speaking, and so I will use it in this chapter. It is by no
means universally accepted as the superior model (cf. N. V. Smith 1978),
however, and formally all the data that it handles can be managed in a one-
lexicon system, by the use of markings on each lexical entry specifying
which rules apply to it in the event of competing rules applying to the same
domain.
We shall say, then, that two forms may be stored for each word: a recognition
form and a production form. The collection of words (form–meaning pairs) that
a speaker can recognize and understand is called the input lexicon; it could
equally well be called the recognition lexicon or the passive lexicon. The
collection of words that a speaker can use (that is, the information necessary to
use them meaningfully and to pronounce them) is referred to as the output
lexicon, but could also be thought of as the active lexicon. (This active/
passive dichotomy is usually thought of as a matter of knowledge of word
meaning rather than pronunciation, but the extension of it to include knowledge
of pronunciation seems to capture the right distinction.) So far, then, we have the
rudimentary diagram shown in Figure 6.1.
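As a rough illustration of the two-lexicon idea, the following Python sketch stores a recognition form and a production form separately for each word. The class name, the method names, and the ASCII strings are all hypothetical placeholders introduced here for illustration; they are not transcriptions, and nothing in the sketch is meant as a claim about how such storage is actually organized.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TwoLexiconSpeaker:
    """Toy speaker with separate recognition and production stores (hypothetical names)."""

    input_lexicon: Dict[str, str] = field(default_factory=dict)   # word -> recognition form
    output_lexicon: Dict[str, str] = field(default_factory=dict)  # word -> production form

    def store_recognition_form(self, word: str, form: str) -> None:
        self.input_lexicon[word] = form

    def store_production_form(self, word: str, form: str) -> None:
        self.output_lexicon[word] = form

    def passive_only(self) -> Set[str]:
        """Words that can be recognized but have no stored production form."""
        return set(self.input_lexicon) - set(self.output_lexicon)


child = TwoLexiconSpeaker()
child.store_recognition_form("take", "teik")   # placeholder recognition form
child.store_production_form("take", "geik")    # placeholder production form
child.store_recognition_form("ball", "bol")    # understood, not yet attempted
print(child.passive_only())                    # words found in the input lexicon only
```

A one-lexicon variant of the same sketch would keep only the recognition forms and derive each production form by rule at the moment of speaking, which is the alternative the text contrasts with the two-lexicon choice.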
Let us explore the properties that can be ascribed to the input lexicon. We
know that speech perception is an active process: the hearer filters and structures
the incoming sound. Several researchers, including Waterson (1970, 1972),
Ingram (1974), Hawkins (1973), Macken (1979), and Wilbur (1981), have
called attention to the possibility that a child may not succeed at first in getting
a complete picture of a word he has begun to learn. Therefore, we may be more
accurate in particular cases if we represent the child's knowledge of some part of
the word's sound pattern by "noise" (Ingram 1974) or by underspecified
phonemes (archiphonemes, macrophonemes). These are useful notational
devices whenever we have reason to believe that, for example, a child has not
figured out what sounds are present in the unstressed syllable of a word or has
been unable to tell which of several fricatives a word ends with. To be more
explicit, these devices are useful notations whenever the child apparently cannot
distinguish perceptually among particular sets of similar words.
Figure 6.1. [Diagram boxes: Collection of percepts/understandings; Input lexicon; Output lexicon]
Note that we cannot rely on the child's pronunciation to let us know what
perceptual distinctions she is making, for children can in fact frequently tell the
difference between two words while they are still unable to pronounce either
one of them. ([Ronnie] Wilbur points out in personal communication that in
adults, cross-dialect phenomena continue to give examples of perception
outstripping production: American Midwesterners who do not distinguish among
/ɛ, æ, e/ before /r/ in their speech nevertheless can reliably distinguish merry,
marry, and Mary in the speech of those who do make the distinction.)
1
To give two simple examples of the use of these notations for incomplete
phonetic input information: suppose that a certain child appears unable to
distinguish between two words which differ only in the shape of a pretonic
syllable, such as along and belong, but that she can distinguish them from long.
Then a noise marker would be appropriate to represent the first syllable of
iambic words in the input lexicon.
Now suppose that we have a child who cannot distinguish /bæs, bæθ, bæf/
from one another at an above-chance level in an appropriate test situation, but
who can tell them from /bæt, bæv/. Here, the child has some knowledge of the
final sound of, say, bath, so we would not use a noise marker. Instead, we would
say that bath is entered as /bæ(unvoiced fricative)/ in the child's input lexicon.
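The underspecified entry for bath can be made concrete with a small Python sketch. It assumes that the three confusable words end in the voiceless fricatives [s], [θ], and [f] and that the two distinguishable ones end in [t] and [v]; the feature table and the function names are simplified assumptions introduced only for illustration.

```python
# Hypothetical, simplified feature table for the final consonants in the example.
FEATURES = {
    "s": {"voiced": False, "manner": "fricative", "place": "alveolar"},
    "θ": {"voiced": False, "manner": "fricative", "place": "dental"},
    "f": {"voiced": False, "manner": "fricative", "place": "labiodental"},
    "t": {"voiced": False, "manner": "stop", "place": "alveolar"},
    "v": {"voiced": True, "manner": "fricative", "place": "labiodental"},
}

# Underspecified final segment of the child's entry for "bath":
# voicing and manner are stored, place is simply left out.
UNVOICED_FRICATIVE = {"voiced": False, "manner": "fricative"}


def matches(segment: str, spec: dict) -> bool:
    """True if the segment agrees with every feature that the entry specifies."""
    feats = FEATURES[segment]
    return all(feats[name] == value for name, value in spec.items())


def heard_as_bath(final_consonant: str) -> bool:
    """Would a word with this final consonant match the underspecified entry?"""
    return matches(final_consonant, UNVOICED_FRICATIVE)


for consonant in "sθftv":
    print(consonant, heard_as_bath(consonant))
```

Because place of articulation is absent from the stored entry, the three voiceless fricatives all match it, while the stop and the voiced fricative do not; a noise marker, by contrast, would correspond to an entry with no features specified at all.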
So, what we have been saying is that the child's ability to use acoustic features
to discriminate meaningful words is typically well ahead of her ability to control
those features for making contrasts in production, but may well be inferior to the
linguistic discrimination ability of the adult. Some discrimination which the child
appears to make may in fact be carried out partly on the basis of extra-linguistic
information and linguistic context. For, like all of us, a child's ability to hear is
conditioned by her expectations of what she is about to hear. This factor is
important to emphasize for two reasons. One will be discussed in Section VI,
where we will explore some implications of Barton's (1976) work which shows
that unfamiliar words in minimal-pair tests of discrimination ability tend to be
misheard as familiar ones. This biases the tests and increases the difficulty of
ascertaining what the child's input lexical representation of a word really is.
The other reason for bringing up the notion of the child's expectations is the
following phenomenon: Macken (1979) and Platt and MacWhinney (1983)
have argued that we sometimes have good evidence for the following sequence
of events. First, a child learns to recognize the sounds of a word adequately but
cannot produce it very well: we say that the input representation is good, but not
the output representation. Usually, the child will then slowly bring the produc-
tion into line with the target, but in certain cases, expected improvements fail to
occur in particular words or sets of words. The child maintains his old pronun-
ciation in such a way that it seems that he is no longer even trying to match the
adult model. Instead, it seems that he has replaced his original input represen-
tation with a new one which is based on his own output. For example, Macken
(1979) gives this analysis for certain events reported by Smith (1973). Amahl,
his subject, produced the word take as [geik] at an early stage, using a general
velar assimilation rule (a type of rule which we will shortly be discussing in
some detail). The rule stopped operating for all other words by Smith's stage
14, but Amahl retained a velar-harmonized form for take until stage 22, and
even created a participle [kukn] for taken at stage 18.
Now, if a child maintains his own form when he is capable of improving it, it
must mean that he has temporarily stopped monitoring, stopped really listening
to himself and/or to the adult model. He expects that he is correct, and does not
bother to check up. Indeed, many of us have adult acquaintances who have an
idiosyncratic pronunciation of some word, and who seem quite unaware that
they are not speaking as other people do. Many irregularities in children's
phonological behavior thus seem to be explainable in terms of the biasing of
perception by expectation.
III.B. Segments, phones, and phonemic contrasts
Now we will consider the early stages of the production of proto-words and
words. Early child speech is often called pre-phonemic (Nakazima 1972;
Menyuk 1977). There are very good reasons for this. One is that phonemic
contrast and phonetic control do not develop in synchrony. One example of this
sort of uneven development can occur when a child honors a contrast without
being able to handle the relevant phonetics at all. So we may find a child who
renders the voicing contrast in word-final position by deleting voiced final stops
and producing the unvoiced stops as a glottal stop. In such a case, for example,
the pair bead, beat would be rendered as the pair [bi, biʔ]. This hypothetical
child has preserved a phonemic contrast without being able to produce either
adult phone involved.
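The hypothetical child just described can be sketched in a few lines of Python. The function name and the broad ASCII-plus-IPA transcriptions are assumptions made for illustration; the two rules are the ones stated in the text (a voiced final stop is deleted, an unvoiced final stop becomes a glottal stop).

```python
VOICED_STOPS = set("bdg")
VOICELESS_STOPS = set("ptk")


def child_rendition(adult_form: str) -> str:
    """Apply the two hypothetical word-final rules to a broad transcription."""
    final = adult_form[-1]
    if final in VOICED_STOPS:
        return adult_form[:-1]           # voiced final stop is deleted
    if final in VOICELESS_STOPS:
        return adult_form[:-1] + "ʔ"     # unvoiced final stop becomes a glottal stop
    return adult_form                    # other words are left unchanged in this sketch


bead = child_rendition("bid")   # adult "bead"
beat = child_rendition("bit")   # adult "beat"
print(bead, beat, bead != beat)
```

The two outputs differ from each other, so the contrast survives, yet neither output contains the adult final consonant: the contrast is preserved without either adult phone being produced.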
The converse case can occur as well: phonetic control can develop ahead of
phonemic contrast. It is very common for all initial stops, regardless of target, to
be produced by a child learning English as voiced (more precisely, to have
voicing onset time between 0 and 20 msec; see Macken and Barton 1980). In
such a case, the phonetics of voiced (short-lag VOT) initial stops could be under
control, but not the phonetics of unvoiced (long-lag) initial stops. One could
correctly say that the child at this stage had acquired the phones [b, d, g], but it
would be quite wrong to say that she had acquired the phonemes /b, d, g/ since
she does not have the contrast between them and /p, t, k/. (For further discussion
with examples, see Moskowitz 1975.)
The second reason why the concept of phoneme is difficult to apply in the
early stages of language development is that for many children, minimal pairs
(pairs of words differing only by the contrast in question) are so rare as to make
statements about the presence or absence of contrast impossible (see Itkonen
1977).
And the third good reason for calling early speech pre-phonemic is even more
linguistically unsettling. At least we can speak of phones in the first case above,
and nothing prevents us from doing so in the second case. That is, we appear to
have phonetic targets which are comparable to one another, independent of the
lexical items (the particular words) in which they are located. In adult
language, we expect that any difference between, say, an /a/ in one word and
an /a/ in another will be completely due to the sounds surrounding them, the
stress pattern, and possibly to some kinds of morphosyntactic factors (e.g.,
being used as a clitic) or more social factors (formality, rate). We are not
prepared to see arbitrary variation in phonetic targeting between one lexical
item and the next. Yet it does happen; it even occurs in adult language in special
marginal cases.
Let us first consider a special case in adult English where a segment fails to
satisfy the criteria for being a phone. The "o" of no is subject to a huge amount
of variation in realization because of the expressive roles it plays; it can occupy
almost all positions in the English vowel space below a diagonal from [] to
[o], including for example [, a, ], and [] as well as the citation form [o]. We
must therefore record as a lexical fact about the word no the "colors" its vowel
would take; in other words, we cannot describe the vowel in no as the phone
[o], and if we insist on saying (for good reasons outside the scope of this
chapter) that it is still the phoneme /o/, we must have a special marking in the
lexicon preventing this /o/ from having its usual phonetic spelling-out as [o] in
certain usages.
The child phonology case to be cited here, from Jacob, parallels the adult one;
the problem is caused by inconsistencies in the amount of variation found for
what should be two instances of the same phone. Jacob produced many tokens
of the targets down and round, both favorite action words. The vowels of the
two words differed in output: the renditions for down were much more variable
than those for round. But there was no reasonable way to ascribe this difference
to phonetic conditioning or to any of the other factors just cited as causing
variation. Thus, these two segments could not be considered tokens of the same
phone.
Similar problems in the definition of consonant phones were noted by
Ferguson and Farwell (1975), and contribute to Ferguson's repeated
suggestions that the earliest productive stage of language acquisition should be
considered a lexical acquisition period rather than a period of acquisition of
primitive phonemes. In this chapter we will be working towards a compromise
model that allows for both the idiosyncratic properties of segments in particular
words and the general properties of those segments which do seem to be
comparable from one word to the next.
III.C. Strategies for dealing with words and sounds
There seem to be a number of strategies that children may draw on as they try to
render adult words within their limited articulatory abilities. Two types of
strategies have been clearly identified in the literature to date. The first type
induces little distortion in the model word, while strategies of the second type
tend to modify it considerably. Most children probably draw on all of these
strategies to varying degrees. However, some of them rely quite heavily on
those which do little violence to the model word, while other children show no
compunction about making gross changes in a fair number of the words that
they attempt. (It has often been speculated that this is a matter of cautious vs.
bold temperament on the child's part, but to date there has been no systematic
attempt to compare phonological behavior with any aspect of personality, or
even with the strategies chosen for acquiring any other aspect of language.)
III.C.1. Non-distorting strategies: avoidance and exploitation
The non-distorting strategies, which may also be termed selection strategies, are
(a) avoidance and (b) exploitation of favourite sounds.
(a) Avoidance. By avoidance we mean that the child does not even attempt to
say words containing certain adult sounds. The confirmation that this
phenomenon can exist in normal children as young as 15 months old, and not merely in
the older child who has required articulation therapy, is a matter of major
importance on both linguistic and psychological grounds. Linguistically, it is
important because it lies entirely outside the range of behavior considered by
Jakobson and requires the construction of additional acquisition theory (see
Ferguson and Macken 1983). Furthermore, it provides one of the clearest
demonstrations of the fact that perceptual discrimination can precede produc-
tion by many months; if there are two similar sounds and one is avoided while
the other is attempted, the child must be able to discriminate between the two
sounds while being able to make only one of them. Psychologically, avoidance
is a stunning phenomenon because it implies considerable metalinguistic aware-
ness on the part of a child who has only recently begun to speak. After all,
avoidance must be the result of a kind of decision.
Consider a child who imitates and uses a set of words beginning with, say, /d/,
but who will not attempt any with /b/ even though he has demonstrated
comprehension of b-initial words like ball, block, box, and so on. At the very
least, such a child must have the feeling that there is something special the
matter with b-initial words, some reason why he does not want to say them.
Ferguson and Farwell (1975) suggested that this might be happening in some of
their subjects; Menn (1976a) was able to demonstrate the b/d case just cited for
Jacob, including showing that the child knew the meanings of a good number of
b-initial words; and Schwartz and Leonard (1982) showed that avoidance could
be demonstrated experimentally in children near the onset of speech (having
fewer than 50 words), although not in somewhat older ones (Leonard, Schwartz,
Folger, and Wilcox 1978).
(b) Exploitation of favorite sounds. Some children early in their speaking
lives seem to seek out adult words that contain particular sounds and add these
words preferentially to their output, although they learn other words as well.
Farwell (1976) is the first study to document this strategy; her case, from the
collection of the Stanford Child Phonology Project, was a little girl who
apparently especially liked fricatives and affricates, for her output was loaded
with words like juice, choo-choo, shoes.
It is clear that both avoidance and exploitation are strategies that we should
expect to find if a child is, in fact, treating the mastery of pronunciation as a
problem to be solved, and is capable of avoiding perceived areas of difficulty
and of capitalizing on perceived areas of success.
III.C.2. Modification strategies: rule use
Now let us consider modification strategies, those which result in changes to the
shape of the word. One case
has become familiar: the case of rule use. Here, the child has a systematic
method of dealing with adult words, one that can be described by a set of rules
for substitution, omission, and occasional metathesis of the sounds of the adult
word. First we will consider some typical examples of this well-studied type of
modification strategy, and then, in Section III.C.4, we will study some more
unruly modications.
Child-phonology rules represent the child's modifications of the adult model
word in a segment-by-segment fashion. They are usually written as direct maps
from the adult sound to the child's sound. When the rules are written this way, of
course, a step is left out: the psychologically intermediate but inaccessible step
of the child's internal recognition encoding of the adult model word which we
just discussed in Section III.A. For the present, we will write rules without that
intermediate step; when we discuss the construction of a psychological model
for child phonology in Section V, we will put it back in again, and also
hypothesize some other intermediate processing levels.
To begin with, let us consider a hypothetical child near the beginning of
speech who has the following list of words:
hat [æʔ]; boy [bɔj]; cat [kæʔ]
nice [naj]; house [æw]; dog [da]
please [pi]; blue [bu]; clock [ka]
drum [dʌ]; up [ʌ]; down [dæw]
This child would appear to substitute glottal stop for final /t/ and to delete other
final consonants. Initial /h/ is also deleted. Liquids are dropped from consonant
clusters. These statements may be translated into formal terms like this:
t → [ʔ] / _#                (/t/ becomes glottal stop and then all other
C → ∅ / _#                   consonants are deleted word-finally)
[+cons, +voc] → ∅ / #C_V    (liquids are deleted from initial clusters)
/h/ → ∅ / #_                (/h/ is deleted word-initially)
The reader may have noted that these four rules are not the only ones that can be
devised to describe the observed behavior. It is important to understand that in
most cases we do not get enough different words from a young child to
determine her set of rules fully. Rules are always to be regarded as the analyst's
tentative hypotheses about the child's mental operations. And it is also important
to remember that a rule is no more than a description of a hypothesized
regularity of behavior. It is not an explanation of anything to say that a child
has a deletion rule or a substitution rule, just as it is no explanation to say that
an apple falls because of gravity.
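
To make the notion of an ordered rule set concrete, the four statements above can be sketched as a small program. This is only an illustration, not a claim about the child's mental operations: the adult transcriptions, the one-character-per-segment encoding, the segment inventory, and the function names are all assumptions introduced here.

```python
# Toy illustration only. The adult transcriptions and the segment
# inventory are simplifying assumptions, not data from the chapter.

CONSONANTS = set("pbtdkgszmnlr")  # treated as [+cons]; glides, /h/ and [ʔ] are not
LIQUIDS = {"l", "r"}


def final_t_to_glottal_stop(segs):
    """t -> [ʔ] / _#"""
    if segs and segs[-1] == "t":
        return segs[:-1] + ["ʔ"]
    return segs


def delete_final_consonant(segs):
    """C -> zero / _#  (applies after the /t/ rule, so [ʔ] survives)"""
    if segs and segs[-1] in CONSONANTS:
        return segs[:-1]
    return segs


def delete_cluster_liquid(segs):
    """liquid -> zero / #C_V"""
    if (len(segs) >= 3 and segs[0] in CONSONANTS
            and segs[1] in LIQUIDS and segs[2] not in CONSONANTS):
        return [segs[0]] + segs[2:]
    return segs


def delete_initial_h(segs):
    """/h/ -> zero / #_"""
    if segs and segs[0] == "h":
        return segs[1:]
    return segs


# The rules are applied in this fixed order.
RULES = [final_t_to_glottal_stop, delete_final_consonant,
         delete_cluster_liquid, delete_initial_h]


def child_form(adult):
    """Map an adult form (one character per segment) to the predicted child form."""
    segs = list(adult)
    for rule in RULES:
        segs = rule(segs)
    return "".join(segs)


if __name__ == "__main__":
    for adult in ["hæt", "pliz", "klak", "najs", "drʌm"]:
        print(adult, "->", child_form(adult))
```

Reordering the list RULES, or replacing one function with another that has the same effect on these few words, gives a different but equally adequate description, which is the point made in the text about rules being tentative hypotheses.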
Now let us examine in more detail two of the best-known rule types of child
phonology: assimilation and voicing/devoicing. Towards the end of this section
we shall also see that there are other strategies that children use which produce
the same effects that these rules do.
(a) Assimilation and consonant harmony: assimilation rules. We often notice
that young children have rules which change the consonants in a word to make
them more similar to one another. As in general phonology, these are called
consonant assimilation rules. For example, a child who can say daddy with a
good initial [d] and egg with a good final [g] may yet say [gɔg] for dog. Such a
child usually also says [gʌk] for duck and truck, etc. These rules may be so strict
for a time that all the consonants in any given output word must be homorganic,
that is, made with the same position of the articulators. Boat, for example, would
have to be produced as either [bowp] or [dowt].
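
The two possible harmonized outputs for boat can be derived mechanically. The sketch below assumes a toy inventory of six stops, one character per segment, and a hypothetical function name harmonize; it copies the place of articulation of either the first or the last stop onto every other stop while leaving voicing untouched.

```python
# Toy sketch of consonant harmony over stops; inventory and names are assumptions.
STOPS = {
    ("labial", True): "b", ("labial", False): "p",
    ("coronal", True): "d", ("coronal", False): "t",
    ("velar", True): "g", ("velar", False): "k",
}
# Reverse lookup: symbol -> (place, voiced)
FEATURES = {symbol: key for key, symbol in STOPS.items()}


def harmonize(word, trigger):
    """Give every stop the place of the 'first' or 'last' stop, keeping its voicing."""
    positions = [i for i, seg in enumerate(word) if seg in FEATURES]
    if len(positions) < 2:
        return word
    source = positions[0] if trigger == "first" else positions[-1]
    place = FEATURES[word[source]][0]
    out = list(word)
    for i in positions:
        voiced = FEATURES[word[i]][1]
        out[i] = STOPS[(place, voiced)]
    return "".join(out)
```

With the adult form written as "bowt", harmonize("bowt", "first") and harmonize("bowt", "last") yield the two candidates mentioned in the text. Which stop acts as the trigger is left as a parameter because, as discussed below, children differ on this point.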
Assimilation involving the feature [nasal] is common, too, in child phonology:
dance may become [næns], with the [d] assimilating in nasality to match
the following nasal; or meat may become [dit], with the [m] losing both its
nasality and its labial position as it assimilates to the final [t]. (Both of these
forms are from Daniel; Menn 1971.)
Sometimes a child may produce some non-harmonic sequences and yet
apparently require harmony in other words: he may say gate correctly, but
produce [gɪg] for big and [gejk] for take. In this case, the assimilation of labials
or dentals to velars occurred only if the velar was word-final; if it was
word-initial, both stops were produced correctly. Relative position of the consonants
in a word is often a factor when some sort of asymmetry of consonant harmony
is found (Ingram 1974). Vihman's (1978) survey suggests that sounds at the
beginning of a word are somewhat more likely to be the ones which are changed
when there is an assimilation rule, but this is merely a tendency.
Assimilation rules can be found in great numbers in adult language as well,
but there is an important difference. In adult language, the usual type of
consonant assimilation is contact assimilation: a segment changes and becomes
more like one that is next to it. Although many adult languages have vowel
harmony, which occurs even when consonants lie between the vowels, very few
adult languages have consonant assimilation at a distance; Vihman finds it in
only 3 of the 88 languages in the Stanford Phonology Archive (not including
some cases in which the intervening vowel is colored by nasalization in nasal
harmony, or by pharyngealization in pharyngeal harmony; these cases are
called prosodies by Vihman). Something special is taking place in child
phonology.
When we find deletion rules, as in our initial examples for this section, or
contact assimilations like the change of ask to [æst], we usually feel that
mechanical ease of articulation should account for them. But when we
contemplate distance assimilation, we find our intuitive notion of simplicity
challenged. Why should a child who can say dad and egg find [gɔg] easier than
[dɔg]? Is this to be explained in natural terms? In a sense, yes, but in terms of a
different kind than we have previously considered, terms which are very
important to the construction of a theory of child phonology.
In trying to understand distance assimilation, we can get some help from
considering general motor behavior. Under what circumstances is an ABA
pattern of behavior easier than an ABC pattern (assuming that A and C are
equally easy to carry out in themselves and as sequels to B)? The only way that
doing A again can be easier is if the sequence is to some extent preassembled
or preprogrammed, for in a memoryless series of events it would not matter
whether an element is one that has recently been used. In other words, doing
A the second time is easier than doing C only if we know that we are going
to do A again and can make use of that information. So the argument goes as
follows: young children often use distance assimilation. We take as a working
assumption that this must make words easier for them. It cannot make words
easier for them unless there is a stage of production at which a word is
programmed or assembled before it is spoken. Therefore, I think that a model
of how words are produced by young children must have such a stage in it. Later
on, we will come back to this point and try to deduce more about the properties
of this stage from the data that we have available.
(b) Other strategies: consonant harmony as a goal. Assimilation rules are not
the only way that children deal with disharmonic sequences. Some children
omit one of the offending consonants: Daniel, who used assimilation on dog,
boat, and a good many other words, said [gej] for gate, rather than [gejk] or
[dejt]. Other children use a glottal stop in place of one of the adult sounds. Such
patterns of rule use linked by similar input and similar output strongly suggest
that we should take a functional approach to child phonology rules; that is, they
make more sense if we think of them as means to some end. And in fact, we have
been doing just that: we have been assuming that these rules are somehow
designed to eliminate disharmonic sequences.
III.C.3. Output constraints and conspiracies: first mention
At this point
it will help to develop some terms for dealing with sets of rules which appear to
serve some common function. Suppose none of the forms produced by a child
contain consonant clusters, for example, or that none have nal stops, or that
none have disharmonic sequences. A statement that a particular sound pattern
does not appear in a corpus and is not expected to appear if we get a larger
sample is a statement of an output constraint. Adult languages have output
constraints as well; consonant clusters are absent from many languages, and
every language has restrictions on how many and what kind of consonants form
a pronounceable cluster (Bell 1971). Vowel harmony, present in quite a number
of languages, is also describable as an output constraint.
Following Kisseberth (1970), when we have a set of rules that all contribute
to eliminating sound patterns which would violate a particular output constraint,
we say that those rules form a conspiracy. In the example from Daniel,
assimilation rules and a (limited) deletion rule were part of the conspiracy to
eliminate disharmonic sequences.
Conspiracies of rules are not the only devices that children use to maintain
output constraints, however. Selection strategies may also contribute: children
may avoid adult words which violate a constraint. Sometimes, this may be a
very minor strategy for a particular child (Daniel probably avoided the word
cup), but sometimes it is a major contributor to the maintenance of an output
constraint.
Let us now look at some cases involving another very common output
constraint in young children. This one actually involves a pair of phenomena
collectively referred to by Ingram (1976) as voicing: the constraint that initial
stops be voiced and final stops be unvoiced. A child may have only one of these
or neither, but the pair is very common for English-learning children.
At the acoustic-phonetic level, the statement is slightly different: initial stops
tend to be voiceless-unaspirated (short-lag VOT) and final stops to be partially
devoiced (see again Macken and Barton 1980; N. V. Smith 1973; B. Smith
1979). This difference in statement is not important within English phonology,
but it becomes very important cross-linguistically, since voiceless unaspirated
stops count as voiced in English phonology, but as unvoiced in Spanish,
French, and many other languages. An explanation for this pair of phenomena
should be in terms of the regulation of glottal airflow; for discussion see Flege
and Massey (1980) and Westbury and Keating (1980). If there is any rule which
deserves to be called a natural process, surely it is the rule of final devoicing: it
is not only found in child language, but is one of the most frequent rules in adult
language, appearing in many forms from a low-level tendency (as in American
English) to the familiar German and Russian final devoicing rule and Turkish
syllable-final devoicing.
So, many children use the natural-process rule of devoicing final stops, and
many also use the natural-process rule of voicing initial stops; Joan Velten is
undoubtedly the best-known example. She said [bat] for pocket, [ba] for pie,
[bat] for bad, [ap] for up, and [zas] for sauce, to choose from a long list (Velten
1941: 867 ). There are no examples involving velar stops in output, for at this
age (23 months) Joan changed all adult velars to coronals (except for [bup],
book). Other children who have the same voicing constraint may use a selection
strategy: words beginning with /p, t, k/ or words ending with /b, d, g/ may be
avoided, and words which begin and end with the preferred sounds may be
selected.
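
As a rule strategy, the paired voicing constraint amounts to two position-specific substitutions. The sketch below is an assumption-laden illustration: it takes as input simplified forms in which the vowel and place changes of Joan's system have already applied, it extends the constraint to fricatives because her form for sauce suggests this, and the function name is hypothetical.

```python
# Toy sketch of the voicing constraint as a pair of rules; inputs are simplified forms.
VOICE = {"p": "b", "t": "d", "k": "g", "f": "v", "s": "z"}
DEVOICE = {voiced: voiceless for voiceless, voiced in VOICE.items()}


def apply_voicing_constraint(form):
    """Voice a word-initial obstruent and devoice a word-final obstruent."""
    if not form:
        return form
    segs = list(form)
    segs[0] = VOICE.get(segs[0], segs[0])
    if len(segs) > 1:
        segs[-1] = DEVOICE.get(segs[-1], segs[-1])
    return "".join(segs)
```

For instance, apply_voicing_constraint("bad") gives "bat" and apply_voicing_constraint("sas") gives "zas", matching the forms cited from Velten. A child using a selection strategy instead would need no such function: the same constraint would show up only as gaps in the set of words attempted.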
Now let us look at a more complicated case, one in which all the three
principal stop positions of English were being produced by the child. Here the
voicing constraint is in full force in final position: final [p, t, k] have been
mastered, while the final voiced stops /b, d/ are avoided, and final /g/ is modified
by being devoiced or deleted.
The constraint has been overcome in initial position: the contrast between
initial /d/ and /t/ has been mastered and initial [k] has been acquired. Initial /p/ is
avoided, but so is initial /g/. (Ferguson 1975 has commented on similar asym-
metries of consonant distributions in child phonology and across adult lan-
guages.) These statements are summarized in tabular form (Table 6.1). Another
important point is exemplified by these data; notice that the voicing contrast has
been mastered for initial dentals, but not for initial velars or labials, and that in
this case we cannot even say that one value of the feature is present for all three
initial stops. A feature that has been mastered (in either the control sense or the
contrast sense) in one phoneme may or may not spread to other phonemes in the
same word position. We presently do not know whether it is possible to explain
the difference between the cases in which a feature generalizes and the cases in
which it remains bound to a particular phone.
Other rule strategies besides the use of voicing or devoicing rules can be
found in children obeying the voicing constraint. We have just mentioned
Jacob's occasional deletion of final /g/, but there are much more interesting
cases to be found. These are the children who add extra segments in order to
render a voicing contrast. It has been claimed that some children add a vowel to
the end of a word with a final voiced stop; this brings the sound into the interior
of the word where it could be managed. Bag might be produced as [bg] or
[bg].
Also, two cases are now reported in which children added nasals rather than
vowels in their apparent efforts to preserve the voicing contrast in final
position. Fey and Gandour (1979) presented a study of a child who found
that he could preserve the voicing of adult final stops by adding a final
homorganic nasal: bag became [bægŋ]. (Phonetically this is rather less exotic
than it looks written out; the effect is just produced by releasing the velar
closure before releasing the stop articulation. However, this cannot well be
considered a natural process; there is no evidence that there is a general
tendency for speakers attempting to maintain voicing through a final closure
to fail with this result.) Clark and Bowerman (1986) report a different use of
added nasal segments: one of her daughters added a homorganic nasal before
final voiced stops, so that for example Bob became [bamp]. The stops themselves
were still devoiced, but contrast was maintained (and the insertion of
the nasal should have helped to maintain the vowel-lengthening which precedes
final voiced stops in English and which in fact serves to carry the final
voicing contrast in some dialects).
Table 6.1. Jacob's consonants

Initial                         Final
p absent      b mastered        p mastered    b absent
t mastered    d mastered        t mastered    d absent
k mastered    g absent          k mastered    g devoiced or deleted
Now that we have seen how the notion of output constraint can serve to bring
together several rules and/or strategies under the observation that they all serve
to maintain the same output constraint, it is time to take a critical look at the
notion itself. So far, all we have is description, not explanation. To say that a rule
serves an output constraint or is part of a conspiracy is only organization of
data. But once we organize the data in this way, a plausible explanation jumps
out at us: the child is modifying unfamiliar sound patterns to make them like the
ones he has already mastered. And that means that the child has to learn sound
patterns, not just sounds. Again, output constraints are only descriptive devices;
what they describe are those sound patterns which a child has mastered vs. those
that he has not. That is why words which do not fit the constraints are almost all
avoided or modified. This is the central thesis of this chapter; we shall explore
its empirical support and its implications in many of the remaining sections.
III.C.4. Another modification strategy: template matching
Now let us consider another type of modification strategy, one evidenced primarily in work
done by Vihman (1976, 1981), Macken (1979), and Priestly (1977). These cases
involve fairly violent rearrangements of sounds of adult words to match templates
of preferred sound patterns. The simpler cases can just as well be considered
cases of rule use, and usually are described in terms of metathesis (place-exchanging)
rules. The more complex cases, however, cannot be described by
rules without a lot of artificial special-case magic, for what makes them so
complex is the fact that the child's attack on the adult word is not fully systematic.
A good simple case to begin with is Vihman (1976). A child learning Estonian
as her first language seemed to have learned to say words containing two different
vowel sounds only if the first vowel was lower than the second. The Estonian
words for mother, /ema/, and for father, /isa/, do not happen to follow this pattern.
For a little while, the child said just [sa] for father; then for four months she
failed to attempt either word, although both father and mother made earnest
attempts to elicit the words /ema/ and /isa/. At 15.5 months, the child began to
rearrange those words to conform to her output constraint: /ema/ emerged as
[ami] or [ani] . . . at which time /isa/ also reappeared, now pronounced [asi], and
the word /liha/, meat, was reproduced, following the same rule, as [ati].
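
The vowel-height template in this case can be stated as a simple check on a form, together with the metathesis that repairs a violation. The three-way height scale, the restriction to five vowel letters, and the function names are assumptions made for this sketch; only the forms themselves come from the case as described above.

```python
# Toy sketch of a vowel-height output template; the height scale is an assumption.
HEIGHT = {"a": 1, "e": 2, "o": 2, "i": 3, "u": 3}  # larger number = higher vowel


def fits_template(form):
    """True unless the form has two different vowels with the first one higher."""
    vowels = [seg for seg in form if seg in HEIGHT]
    if len(vowels) != 2 or vowels[0] == vowels[1]:
        return True
    return HEIGHT[vowels[0]] < HEIGHT[vowels[1]]


def swap_vowels(form):
    """Exchange the two vowels of a two-vowel form, leaving consonants in place."""
    positions = [i for i, seg in enumerate(form) if seg in HEIGHT]
    if len(positions) != 2:
        return form
    first, second = positions
    out = list(form)
    out[first], out[second] = out[second], out[first]
    return "".join(out)
```

The adult forms "ema", "isa", and "liha" all fail fits_template, while the child's forms "ami", "asi", and "ati" pass it. swap_vowels("isa") yields the attested "asi"; for "ema" it yields "ame", which satisfies the template but is not what the child said, so simple metathesis does not capture the whole of her treatment of these words.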
An example of a case where the child was less systematic about the map from
the adult word to output is given in Priestly (1977) (also discussed in Ingram,
1979). Priestly's son Christopher treated virtually all stop-final adult two-syllable
words and a fair number of vowel/sonorant-final two-syllable words
according to the following patterns: Consonant selection:
C₁ C₂ → C₁ j C₂
examples: pillow [pijal]; Brenda [bajan]; tiger [tajak]
or
C₁ Cₓ C₂ → C₁ j C₂
examples: rabbit [rajat]; melon [majan]
with a few cases of idiosyncratic rearrangements, such as streamer being
produced as [mijat]. There was also a choice of vowel treatments; sometimes
Christopher was able to match two vowels of the target, but at other times he
replaced one or both by [a]. In addition to the cases already listed, consider the
apparent metathesis of vowel features involved in his rendition of woman as
[wajum]!
Other two-syllable words which ended in a vowel or sonorants were treated
without these special medial-[j] rearrangements: examples are bacon, produced
almost correctly as [bejkan], kitchen, where the medial affricate apparently
caused the only problem, rendered [kkn, ktn], and scissors, [szz].
While it is possible to discern some tendencies in Christopher's assignments
of particular adult forms to particular outputs, Priestly makes it clear that there is
considerable arbitrary variation from word to word. This fact of lexical variation
is further emphasized by Christopher's variation across tokens of the same
word: monster was recorded as [majs] in weeks 4 and 6 of the study, but as
[mjan] in week 5; dragon was given as both [dajan] (week 3) and as [dajak]
(week 4).
In Priestly's case, then, the child had a favorite output shape to fill, but only a
few constraints on which consonants and vowels he picked to fill it with.
Macken's 1979 subject Si, acquiring Spanish, shows us a much more constrained
output template, that is, one which allowed a very limited set of
consonants, and a much greater abandon in her treatment of the model word.
(The latter fact probably also reflects the much greater proportion of polysyllabic
words among her targets.)
Si could produce disharmonic sequences in a word only if one target con-
sonant was labial and another was dental. Adult words which met this criterion
were produced so that the labial preceded the dental; much deletion and occa-
sional metathesis occurred.
examples: manzana [mana] pelota [patda]
zapato [patda] elefante [batte]
Fernando [wanno] sopa [pwta]
In Si's case, the details of what is deleted and what is selected defy organized
statement in terms of rewrite rules. As Macken says, this is goal-directed
behavior: the child is looking for consonants that she can fit into her output
template and ignoring the rest.
IV. Rule creation
IV.A. Extending the notion of ease of articulation: one key to a new theory
When a child's production of a word fails to match the adult model, we cannot
help assuming that there must be some sense in which what he does produce is
easier than what he has failed to produce. But what sense is this? How can
[bada] be easier for Macken's Si than [daba]? Why will some children use [l] for
/j/ and others use [j] for /l/? Why do some children exploit fricatives while others
delete them, avoid them, or replace them with stops? Clearly, if we stick to our
commonsense starting assumption, then it must be the case that what is easier
for one child can be harder for another. Perhaps a little of the variation is due to
anatomical differences, but we simply do not have the means to investigate that
hypothesis. A much more fruitful approach is to assume that a great deal of
ease and difficulty is not a matter of physiology at all, or, to put it another
way, that physiological causes are only one factor in determining ease of
articulation for the individual child. The other factor, and I propose that it is
the major factor, is the state of a child's knowledge at a given time.
Let me give an example. A child may, as we have said, discover how to say
[l] before how to say [j], or the reverse may be true. Suppose a particular
child has discovered [l] first, by chance. We notate this discovery as the
invention of a rule taking /l/ into [l]. Now this child may slip into her [l] while
trying to say [j], either accidentally or on purpose. If she finds the approximation
good enough, she will continue to use it: she will have thus discovered or
invented a modification rule. Again, in this case, [l] is easier than [j] only
because this child happens to have found out how to make an [l] rst.
I suggest, in short, that a two-stage discovery process is probably involved in
a child's establishment of a new articulatory gesture as her way-of-saying a
particular target sound. The first stage is a matter of trial-and-error attempts to
match the sound sequence; the second stage is one of deliberate or accidental
overgeneralization of the success of that articulatory gesture, that is, the use of it
to render similar adult targets.
Let us consider the hypothesized scenario here in more detail, for it is the
heart of this chapter's proposal for dealing with one of the fundamental problems
of child phonology, namely, how can there be so much individual variation
and yet such strong general tendencies? We suppose, then, that variability
across children originates with each child making trial-and-error starts at match-
ing adult sound patterns. For each given sound or pattern, some children will
succeed and some will fail. External factors, such as the frequency and
salience of the sound in the speech of others, may contribute to the likelihood
of success; so will internal factors: the probability of accidentally hitting on an
acceptable way to produce it and the salience of the sound in one's own speech.
We frankly do not know why some sounds are more probable than others;
Stevens' (1972) notion that favored phones are those which are acoustically
stable (i.e., permit a certain sloppiness in articulation without showing appreciable
acoustic change) is certainly an attractive idea, but we cannot yet simulate
the child's vocal tract accurately enough to test this idea with acoustic modeling.
(However, progress has recently been made in this area; see Goldstein 1980.)
The accidental aspect of learning to produce target sounds is a principal source
of individual variation, but it is also a principal source of the probabilistic
universals of order of acquisition; roughly and with all due caveats, stops
usually are acquired before fricatives, labials usually before velars, nasals
usually early, liquids usually late. (See Sander 1972, both for data on English
and for methodological considerations.) If the reader will permit me some
license in the statement of probabilities, we might say that a [b] is a low pair,
[k] is jacks or better, [l] is a flush, [] is a straight flush, and the fricative [r̝],
which Jakobson dwelt on as the latest acquired Czech phoneme, is a royal flush
in spades: some kid somewhere in Czechoslovakia is going to get it phonetically
right in her first ten words, but don't bet on her being in your data sample.
We should stress one more thing about this proposed initial trial-and-error
stage of discovery: a child may accept her rendition of a sound even when it is
quite inaccurate. Some rules that give inaccurate renditions of adult targets
therefore arise at this first stage. But many more may arise in the second stage, as
the child makes use of her initial accomplishment.
IV.B. Natural processes
It is quite reasonable to say that both /l/ → [j] and /j/ → [l] are natural phonetic
processes, in that articulatory factors make it quite likely that a clumsy attempt
at either of them will produce the other, rather than, say, a [t] or a [b]. Put another
way, a child with a certain amount of experience at making speech sounds with
his mouth is likely to get some of the properties of, say, [l], correct (in a word
that does not present a host of other problems): perhaps the voicing, the
continuancy, the central tongue placement, or the lack of rounding. [l] and [j]
share all of these properties, so a child who is doing well at approximating one
of these two phones is quite likely to end up with the other as his approximation
to it.
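
The reasoning of this paragraph can be put in the form of a toy calculation: a child who has some phones under control renders a new target with whichever mastered phone shares the most properties with it. The four properties for [l] and [j] are those named in the text; the values given for [t] and [b], the set-overlap measure of similarity, and the function name are assumptions made purely for illustration.

```python
# Toy sketch: choose the mastered phone sharing the most properties with the target.
# Property sets for "t" and "b" are illustrative assumptions, not a feature analysis.
PROPERTIES = {
    "l": {"voiced", "continuant", "central", "unrounded"},
    "j": {"voiced", "continuant", "central", "unrounded"},
    "t": {"central", "unrounded"},
    "b": {"voiced", "unrounded"},
}


def approximate(target, mastered):
    """Return the mastered phone with the largest overlap in properties."""
    return max(mastered, key=lambda phone: len(PROPERTIES[target] & PROPERTIES[phone]))
```

A child who controls [t], [b], and [l] but not [j] would, on this measure, use [l] for /j/; a child who controls [j] but not [l] would do the reverse. The sketch says nothing about which phone is discovered first, which the text attributes largely to chance.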
Informal observation suggests that [l] and [j] are roughly equally likely to be
found substituting for one another ([w] or a similar sound is also found
frequently for dark L [ɫ], of course). In other cases, there is a heavy bias in
favor of one of a pair of phones. For example, in word-initial position, stops are
much more likely to be discovered before fricatives and then to be used to
substitute for them. Similarly, voiced stops are likely to be used for unvoiced
stops in initial position, as we have already seen. We certainly have enough
reason to say that stopping (use of stop for fricative) and voicing are natural
in initial position; that is, we have reason to believe that there is a high,
physiologically governed probability that the child making a first attempt at
an initial fricative will produce an initial stop, and that the child first attempting
an initial unvoiced stop will produce a voiced stop instead. This, I think, is the
only coherent interpretation of the notion natural process, although other
views certainly appear to be held (see Stampe 1969; Ingram 1976, but also
Ingram 1979).
In summary, I propose that natural processes are really descriptions of
those pitfalls of learning to articulate which are commoner and more heavily
determined by physiology. To build a rigorous theory of the acquisition of
phonology, one must also be able to explain why children fall into those
particular pits.
And that step would still be only a beginning, for physiology only dictates what
articulatory goals are likely to be surrounded with what traps. To explain how
children succeed in avoiding or climbing out of them, we need a problem-solving
theory, a cognitive theory. The essence of such a theory for the acquisition of
phonology, again, is the trial-and-error discovery followed by application of the
discovered skill to new cases, a model which will be very unsurprising to any
developmental psychologist.
IV.C. Non-natural rules
There remain some kinds of rules that are at a considerable remove from the
solution of particular articulatory problems.
A very important kind of non-natural rule arises as the child begins to attend
to the fact that what appears to be the same morpheme is not always produced in
the same way by adults. Sometimes that child is correct in interpreting her
observations this way; that is, sometimes she has indeed run into a case of
allomorphy or of stylistic variation. However, sometimes she is incorrect; what
appears to be variation in the shape of a single morpheme is in fact a case in
which the adult is sometimes using one morpheme and sometimes using two
which the child has failed to segment. For example, if a child notices the
Z-morpheme of the English possessive and plural appearing on certain
nouns but does not yet understand that the final sibilant has one or both of
those meanings, he may develop his own phonological hypothesis about
where those final sibilants are supposed to appear. Daniel (Menn 1971) created
a rule adding [s] to the end of all English words ending in /r/, apparently because
there was an accidental abundance of plurals and possessives on names and
objects in /r/ in his environment. He may have figured that the sibilant-final
forms which he heard were the full and correct forms of the words which he also
heard with final /r/; that is, he took pears as the full form of pear, Peter's as the
full form of Peter, etc.
It is also the case that rules which once had an articulatory base, after they
have been invented, seem to acquire considerable autonomy and may generalize
without any further articulatory motivation. A child may apply a rule for one
segment (or sequence of segments) to a similar one even though he could have
produced the latter correctly. This seems to be the case for several rules used by
Amahl (see Smith 1978). Rules are much more than articulatory habits, then;
they are transduction habits, habits of rendering perceived targets in particular
ways. Illustrations and further discussion will be presented in Section IV.E,
Overgeneralization.
It is too early to make strong generalizations about the ages at which trans-
duction rules of different kinds can be found, but roughly, it seems that the very
youngest children's rules are mostly those which lend themselves to
explanations in terms of seeking solutions to articulatory problems; as these
problems are overcome, we begin to see more instances of rules that arise from
overgeneralizations of other rules, and more rules which reflect the child's
guesses about the reasons for variation in words of the adult language.
IV.D. Rule origin and growth
We have already found ourselves considering the topic of rule origin; let us now
do so in more generality and depth. We have characterized transduction rules as
systematic correspondences between adult and child sound patterns, ranging
from correct renditions (/d/ → [d]), omissions, and natural substitutions (//
→ [d]) to the idiosyncratic rule inserting [s] after word-final /r/ that we have just
discussed. There is also a range in how systematic a rule is. Some are excep-
tionless; most have a few lexical exceptions which typically consist of forms
that were learned before the child invented the rule in question, or of forms
which are the forerunners of a new rule. And some rules have so many
exceptions that they reach the point where we are better off abandoning the
attempt to write them; the Priestly case was one example of such a state of
affairs.
The evidence for the nature of rule change is somewhat sketchy, because rule
changes can take place in a short time, sometimes within a few hours. Fine-
grained longitudinal study is needed to give a picture of Before, During, and
After in such cases. This is emphatically not to say that all rule change is rapid.
Replacement of one well-established rule by another may take place over a
period of weeks (and fossil forms created by the old rule may survive
indefinitely).
IV.D.1. Rule origin We have already discussed trial-and-error experimentation
as a source for correct transduction rules (/d/ → [d]) and for natural
transduction rules. But it should be noted that a child's trial-and-error sessions
do not always lead to the formation of a rule. Even if the child manages a perfect
rendition of some sound pattern, she may be unable to capture the trick of doing
it at will. For example, Daniel (Menn 1971) made dozens of attempts at the
word peach during the period when his consonants were subject to assimilation.
If he had been able to make the beginning of the word affricate to match the
end, he presumably would have had no problem. But he had not learned to
produce any initial affricates, and his versions of the word included [dits, cit,
nits, its, pip] and [pit] itself at various times. He settled on none of them.
Yet sometimes a rule actually emerged within hours: Daniel tried [as] and
[dts] for box at 10;16, and later the same day his assimilation rule made its first
true appearance, with dog as [gVg], a form it kept stably for months (as far as the
consonants were concerned).
The other case of rule origin in the literature has been called consolidation
(Menn 1976a). This term is used to describe the situation in which two similar
adult target sound patterns are involved in very similar trial-and-error
sequences, and end up being handled in the same way. Correct versions of both of the
patterns may be produced in the course of the trials. Jacob varied between [ei]
and [i] for the vowel of both tea and table for some weeks before settling on
[i] for both. The mutual influence of similar sound patterns is clearly
demonstrated in such cases. Template matching can also originate in this fashion; see
Vihman (1981).
IV.D.2. Rule generalization Rule origin can occur through rule generalization,
for of course dividing a rule from its predecessor is often difficult or
arbitrary; there is often no sense to the question "is this a new rule or a
generalized version of an old one?" Rule generalization basically means the
extension of a rule to new cases, and this covers two different kinds of events.
To discuss them, we need the concept of the domain of a rule. The domain of a
rule is simply the set of cases to which it is actually applied. For example, the
domain of a rule that applies to all English voiced obstruents is just the set of all
instances of /bdgvz/.
Formally, if we have an exceptionless rule, its domain is specified in its
structural description. In the example given, the structural description could be
written [ + obstruent, + voice ].
If a rule has lexical exceptions, sounds in the excepted words are not in its
domain even if they meet its structural description. Thus, if the word bad were
simply listed as a lexical exception to a rule otherwise applying to all voiced
obstruents, the /b/ and /d/ in it would be outside the domain of the rule. If, at a
later time, bad ceased to be an exception, it would by definition have been
brought into the domain of the rule and, thus, the rule would have become more
general without any change in its structural description at all. We might term this
type of rule generalization lexical smoothing. Lexical smoothing is important
in child phonology because lexical exceptions to rules are so frequent. Yet it is
not really a change in the rule; it is only a change in the set of exceptions to it.
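The distinction between a rule's structural description and its domain can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of the analysis above: the toy feature table, the three-word lexicon, and the names Rule, matches, and domain are all hypothetical. It shows only that removing a word from the exception list enlarges the domain while the structural description stays the same.

```python
from dataclasses import dataclass, field

# Toy feature table (an assumption for illustration, not a full analysis of English).
FEATURES = {
    "b": {"obstruent": True, "voice": True},
    "d": {"obstruent": True, "voice": True},
    "p": {"obstruent": True, "voice": False},
    "m": {"obstruent": False, "voice": True},
}


@dataclass
class Rule:
    structural_description: dict                    # e.g. [+obstruent, +voice]
    exceptions: set = field(default_factory=set)    # words listed as lexical exceptions

    def matches(self, segment):
        """True if the segment meets the structural description."""
        feats = FEATURES.get(segment, {})
        return all(feats.get(k) == v for k, v in self.structural_description.items())

    def domain(self, lexicon):
        """The set of (word, position, segment) cases the rule actually applies to."""
        return {
            (word, i, seg)
            for word, segs in lexicon.items()
            if word not in self.exceptions
            for i, seg in enumerate(segs)
            if self.matches(seg)
        }


# Hypothetical three-word lexicon; "bad" starts out as a lexical exception.
lexicon = {"bad": ["b", "a", "d"], "dab": ["d", "a", "b"], "map": ["m", "a", "p"]}
rule = Rule({"obstruent": True, "voice": True}, exceptions={"bad"})

before = rule.domain(lexicon)
rule.exceptions.discard("bad")      # lexical smoothing: the exception list shrinks
after = rule.domain(lexicon)

print(sorted(after - before))       # the /b/ and /d/ of "bad" have joined the domain
```

In this sketch the rule object itself is never rewritten; only the exception set changes, which is the sense in which lexical smoothing is not really a change in the rule.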
The other type of rule generalization is formally expressible as a relaxation of
the structural description, allowing additional phonologically defined sets of
words to be operated on by the rule. For example, a rule which at some point
applies only to final /b/ might at a later time apply to all final labials, or to all
final obstruents, or to all instances of /b/. Any of those changes would bring new
sets of sounds into the domain of the rule, thus generalizing it. A relatively
technical note: in child phonology, we often have trouble determining the
domain of a rule for various reasons. Here is one interesting problem: consider
the data from Joan Velten given above (Section III.C.3). She had no velars in her
output; she had initial voicing and final devoicing of other stop consonants.
Should velars be considered to be in the domain of the voicing and devoicing
rules? It is easy to write the rules either way (with voicing and devoicing rules
applying directly to all stops before the conversion of velars to dentals, or with
fronting preceding voicing and devoicing). Only in the latter order can the
voicing rules be written excluding velars and still give us the observed distri-
bution of forms. Now the fact is that when velars show up, they may not be
subject to either of the rules obeyed by the other stops, so it is preferable to write
the rules the second way, and thus to make no vacuous claims about the velars.
If the velars do show up obeying the voicing and/or devoicing rules, that would
then count as a generalization of the two rules.
IV.E. Overgeneralization
Just as in the acquisition of morphology or syntax, rule generalization can create
incorrect forms, and thus, from the adult point of view, be overgeneralization.
The term is used loosely; typically it is used when a rule produces some good
results and some bad ones. If a rule always produces modified forms (bad
results), we do not bother to call extensions of it overgeneralizations except
when they make a child's approximations worse than they were before the rule
affected them. Let us consider some examples.
Daniel (Menn 1971) had the two words down and stone rendered as [dn]
and [don] from the time of his first attempts at them. Then he developed a rule of
nasal harmony: he made all of the stops in a word nasal if the final stop was
nasal. Down and stone remained lexical exceptions to this rule; that is, after he
had been saying [nns] for dance and [ein] for train for two weeks, he still
maintained the two older words in their unassimilated form. Eventually, how-
ever, there was a period of time in which he varied between [nn] and [dn]
for down, and between [non] and [don] for stone. Finally, the assimilated forms
for these two words took over completely and they were no longer lexical
exceptions to the rule. From the adult point of view, these two words were
poorer approximations to the adult model after the rule had been applied to them
than before (indeed, down had been perfect). Therefore, the generalization
involved in extending the domain of the assimilation rule to include down and
stone (a case of lexical smoothing, to use the term introduced above) is an
overgeneralization of the assimilation rule.
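The course of events just described can be sketched in a few lines, under loudly stated assumptions: the three-stop inventory, the function names nasal_harmony and produce, and the vowel symbols (which are placeholders, since the point concerns only the consonants) are all hypothetical. The sketch applies a nasal harmony rule to every word except those on an exception list, and then removes the exceptions.

```python
# Oral stop -> corresponding nasal stop, for a toy three-stop inventory (assumed).
NASAL_OF = {"b": "m", "d": "n", "g": "ŋ"}
NASALS = set(NASAL_OF.values())
STOPS = set(NASAL_OF) | NASALS      # oral and nasal stops together


def nasal_harmony(segments):
    """Make every stop in the word nasal if the last stop in the word is nasal."""
    stops = [s for s in segments if s in STOPS]
    if stops and stops[-1] in NASALS:
        return [NASAL_OF.get(s, s) for s in segments]
    return list(segments)


def produce(word, segments, exceptions):
    """Lexical exceptions keep their older form; all other words undergo the rule."""
    return list(segments) if word in exceptions else nasal_harmony(segments)


# Pre-rule renditions; vowel symbols are placeholders.
FORMS = {
    "stone": ["d", "o", "n"],
    "dance": ["d", "a", "n", "s"],
    "dog": ["d", "o", "g"],
}

early = {w: produce(w, f, {"down", "stone"}) for w, f in FORMS.items()}
late = {w: produce(w, f, set()) for w, f in FORMS.items()}   # after lexical smoothing

print(early["stone"], late["stone"])
```

With the exception list in place, stone keeps its unassimilated consonants while dance is harmonized; once the list is emptied, stone comes out with two nasals, a poorer match to the adult model than before.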
A change in the structural description of a rule can also produce overgeneralization
(recidivism in N. V. Smith's terminology). Here is his example from
Amahl (1973, 152–3): At stage 1, /s/ and /l/ were normally neutralised as [d],
together with all the other coronal consonants . . . (I omit his description of
exceptions to this rule, which generally made coronals into [d].) Then /l/ began
to appear in A's speech before any coronal consonant; for example, lady was
rendered either [de:di] or [le:di]. So /l/ was optionally excepted from the general
treatment of coronals in certain environments. Then at stage 5 /s/ (and shortly
thereafter /ʃ/) became [l] before any coronal consonant . . . : sausage [ldid];
shade [le:t] . . .
Here, the new rule for realizing /l/ as [l] in some environments had added /s/
and /ʃ/ to its domain. So it had generalized by a change in the structural
description: the input to the rule had originally been /l/, but later included /l, s, ʃ/.
What makes this an overgeneralization? Smith says: Now originally two words
such as side and light were both [dait], but after the appearance of /l/ before
any coronal consonant they became distinct as [dait] and [lait] respectively.
However, once /s/ was liquidised the two words fell together again perfectly
regularly, as [lait].
What is lost when this /l/-realization rule is generalized, then, is the contrast
between /l/ and the sibilants /s, ʃ/. (Of course, there is a compensating gain in
this case, because there is contrast of /s, ʃ/ with /d/ only after the /l/ → [l] rule
generalizes.)
Reviewing this section: we have seen that rule creation can take place through
probable or natural failures, such as the production of a stop for an initial
fricative, or through the consolidation of similar forms. It is not to be forgotten
that the discovery of a correct articulation for an adult sound is also a rule in the
sense of a connection between what is heard and what is produced. The child's
existing repertoire has a great deal to do with what form new rules may take.
Non-natural rules can arise when a child misapprehends an allomorphic
variation and treats it as a purely phonetic rule without semantic significance,
or when a child performs major alterations to get a target to fit a canonical form.
Rules can grow and generalize in two ways: by overcoming lexical excep-
tions (lexical smoothing) or by generalizing the class of sound patterns to which
they apply; overgeneralizations can occur as a result of either of these kinds of
rule growth.
V. Towards a psychological model of phonological development
V.A. The theoretical importance of lexical exceptions
and overgeneralizations
Lexical exceptions and overgeneralizations are important data for developing a
psycholinguistic theory of language acquisition. To begin with, overgeneraliza-
tions are inexplicable if one holds the view that the child makes word-by-word
progress towards correct productions; that is obvious.
Lexical exceptions are also inexplicable on the neo-Jakobsonian view that the
acquisition of phonology is purely a matter of acquiring distinctive features
(Menn 1981). After all, Daniel was able to make the distinction between nasal
and non-nasal dentals in production before, during, and after the time his nasal
assimilation rule applied: he had no problem producing daddy with initial [d]
and no with initial [n] during the time that he said [nns] for dance and so on. So
the fact that these words were originally exceptions to the nasal assimilation rule
cannot be described in terms of distinctive features. Overgeneralizations cannot
be accounted for in terms of acquisition of distinctive features either. Lexical
smoothing (e.g., the overgeneralization of the nasal assimilation rule to down
and stone) is certainly not a matter of learning to make a new distinction, and
neither is the loosening of structural descriptions. If we re-examine Smith's
recidivism case, we see that it only involves a shift in mapping input dis-
tinctions onto output ones, not the introduction of new output features. (Amahl
mapped /l/ onto [l] in certain environments and all other coronals onto [d]; the
overgeneralization which then took place resulted in his also mapping two other
coronal continuant consonants, /s/ and /ʃ/, onto [l] in those environments.)
Similarly, one cannot explain lexical exceptions or lexical smoothing
(although one can handle recidivism) within a theory which says that the
acquisition of phonology is purely a matter of overcoming natural processes.
Consider: if nasal harmony is not a natural process, then the natural process
approach is not able to deal with one of the commoner rules of child phonology.
On the other hand, if it is a natural process, one has to explain why it did not
apply to down and stone (i.e., why it was suppressed, in Stampe's terms, for
these two words) initially, and then began to show up on other words and
eventually on these two themselves.
Finally, one cannot explain lexical exceptions or overgeneralizations within a
theory which might claim that the acquisition of phonology is purely a matter of
overcoming output constraints, as I might have tempted you to think in Section
III.C.3, Output constraints and conspiracies. Such a theory would be subject
to exactly the same inadequacies as Jakobson's in these cases; for example, it
could not deal with the existence of lexical exceptions to rules.
Summarizing, if we want a functional, explanatory theory of the acquisition of
phonology (a theory that does more than say "children have rules, but the rules
sometimes have exceptions"), we need a theory which is more complicated.
V.B. Phonological idioms
One thing that we have just seen is that articulatory success on particular sound
patterns sometimes cannot be extended to new instances of very similar pat-
terns. The ability to say down and stone without nasal harmony apparently was
not generalizable to dance (let alone to prune or to jump).
The most spectacular cases of non-generalizable articulatory accomplish-
ments were analyzed by Moskowitz (1970b); she aptly named them (progres-
sive) phonological idioms. By this she meant words which are pronounced
quite well, sometimes perfectly, and, crucially, much better than words of
similar adult sound pattern. These are, in short, words which are exceptions to
the child's modification rules and/or output constraints. The classic example is
Hildegard Leopold's pretty. She produced this word quite accurately as one of
her first words at about 9 months of age. However, then and for many months
thereafter, she produced no other consonant clusters and only one other word
violating consonant harmony, tick-tock. Finally, at a point after she had learned
to break the consonant harmony constraint in general, pretty was changed to
roughly [bidi], thus becoming part of the system in effect at that time. (See also
Moskowitz 1970a.)
A good many of the children studied have a few progressive phonological
idioms among their early words. These phenomena as well as the less spectacu-
lar lexical exceptions discussed in the preceding section are clearly material
which must be explained. Note that such lumpy pattern-and-exception land-
scapes are characteristic of the most closely related psychological areas that we
know of: adult language is full of idioms, and cognitive development is full of
instances in which the mastery of special cases long precedes the mastery of
general skills. It seems that child phonology is more complicated than was once
thought, but it still appears to be no more complex than adult syntax or cognitive
development. (This rather silly sounding remark is provoked by those who
complain that if one introduces all these complexities, there is no elegant theory
left any more. I believe it is one of the corollaries to Murphy's law, however, that
nothing is as simple as it originally appears to be.)
V.C. Canonical forms
Ingram (1974) and Waterson (1971, 1972) have both shown that a young child's
output forms can be sorted into sets of canonical forms (Ingram) or prosodies
(Waterson). Prosody is here used in the Firthian sense of a sequence of several
archiphonemes (partially specified phonemes), and is exactly equivalent to the
notion of canonical form. The members of such a set of forms have some strong
syllable-structure restrictions in common: a set will be, say, just CV words, or
just CVC and VC words, etc. What makes them interesting, indeed surprising, is
that these sets are also restricted as to what phones can appear in them.
For example, taking Waterson's data, one set consists of forms for fly, barrow,
and flower; these are all realized by forms consisting of an open syllable with
voiced, continuant, labial onset: [w], [b]. Another set consisted of fish,
fetch, vest, brush, and dish; these were rendered as [(C)V] with the vowel
always mid-high as it is in the targets. A third set was made up of CVCV forms
in which the Cs were stops and the second syllable was an exact reduplication
of the first; the targets mapped into this canonical form included Bobby, biscuit,
kitty. Another set, which allowed the vowels to differ, was of the form [VV],
used for Randall, finger, window, and another.
Such sets may be maintained by any of the strategies that we have discussed:
by selection of adult words that fit a form, by use of a rule, or by
template-matching. We can thus see in phonological development a gradual weakening of
restrictions on the co-occurrence of phones and the realization of more combi-
nations of syllable structure with phonetic content, until we can no longer sort
the child's output into these neat sets. In this progression, phonological idioms
represent the most primitive level in the sense that they are the forms with the
tightest relationship between phonetic content and syllable shape. A little set of
lexical exceptions to a rule like down and stone represent a slight weakening in
that relationship; they were produced, remember, as [dn] and [don], two
forms differing only in the vowel.
V.D. Motor programming: a psycholinguistic account of output
constraints and canonical forms
In the preceding section, we implied an interpretation of early output constraints
and their gradual relaxation: it is as though the beginning speaker cannot vary
some feature values in the course of a single word even though he can make the
different sounds in separate words. To take a familiar example, a child with a
consonant harmony constraint may be able to make consonants at two or more
positions of articulation, e.g., be able to say toy and boy, yet be able to say only
[bb] for tub. As Waterson says (1972; 13), there is difficulty in the planning
and production of rapid changes of articulation in a short space of time. There
is a sense in which the whole word, for a child such as this, can be thought of as
bearing a single specification for place of articulation. (This idea has antecedents
in several theories of vowel harmony in adult language, e.g., Wellmers
and Harris 1942, Waterson 1956/1970.) For a child like Waterson's P, an output
word must conform to one of the given canonical forms, and within that
restriction, only a few degrees of freedom are left for the individual word.
We can tie all of these phenomena together and understand how they fit into
an acquisition process if we make an analogy with computer programming.
Suppose that learning to pronounce a sequence of sounds is like creating a
program that the articulators and the larynx execute. A phonological idiom
would then be like an invariant program, one which has no variable parameters
that the user is free to set. A canonical form would be like a program in which
some parameters are fixed, but others are settable. Let us consider some
examples using this metaphor. Assume a child has CV(CV) as the canonical
form subsuming, say, bye-bye and baby as [baba], ball as [b], doggie as [dd],
and there as [de]. In this hypothetical case, the program can either stop after one
CV cycle or produce a second CV. The only stops are [b] and [d], which means
that there are two choices for consonant position: labial or dental; this choice is
made once for the whole word. It also means that there is no choice for voicing
or nasality within this program (which means that the canonical form should in
fact have been written out as C[+voice, −nasal] V (CV)). Note that there is
considerable freedom of choice for the vowels, but that the vowel is also
specified once for the whole word.
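The programming metaphor can be taken almost literally. The following is a minimal sketch, assuming nothing beyond the hypothetical CV(CV) child just described; the names cv_cv and make_idiom are invented for illustration, and the orthographic string stored in the idiom stands in for whatever fully specified articulatory routine the child has. A canonical form is written as a function with a few settable parameters, and a phonological idiom as a routine with none.

```python
# [+voice, -nasal] are fixed by the form, so each place value yields exactly one stop.
STOP_FOR_PLACE = {"labial": "b", "dental": "d"}


def cv_cv(place, vowel, syllables=1):
    """CV(CV) program: place and vowel are each chosen once for the whole word."""
    if place not in STOP_FOR_PLACE:
        raise ValueError(f"place {place!r} is not available in this form")
    if syllables not in (1, 2):
        raise ValueError("the program stops after one CV cycle or after two")
    return (STOP_FOR_PLACE[place] + vowel) * syllables


def make_idiom(stored_form):
    """A phonological idiom: an invariant program with no parameters to set."""
    def run():
        return stored_form
    return run


pretty = make_idiom("pretty")        # orthographic stand-in for the stored routine

print(cv_cv("labial", "a", 2))       # the [baba] form used for bye-bye and baby
print(cv_cv("dental", "e"))          # the [de] form used for there
print(pretty())
```

Nothing in the call to cv_cv allows the second consonant or the second vowel to differ from the first, which is how the sketch captures the claim that each choice is made once for the whole word.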
This child might also have another canonical form, say (C)V, like
Waterson's child. This form is like a program that allows some leeway for
specifications of the initial consonant and the vowel, but always finishes the
word with an []. Such forms have always been puzzling before; it is easy to
imagine why assimilated forms are simpler than non-assimilated forms, but
what good are canonical forms like (C)V ?
If the programming metaphor is roughly accurate, we now have an answer
to that question. Even though a form like (C)V requires a change in the
articulatory position for the production of bush or fish, it has very few variable
parameters. Therefore, once it has been learned, it can be highly automatic to
run. The program is called up, the initial consonant is chosen, the vowel is
chosen, and it runs with no more attention than would have been necessary to
produce an open syllable. Waterson (1972; 17) noted: each word appeared to
be learned as an individual item . . . at first there were only one or two examples
of a particular pattern and then there would be quite a sudden increase.
So now, we can describe phonological development by saying that the child
gradually learns to improve in three areas of production control: (1) she learns to
increase the number of parameters that can be freely assigned values in a given
word; the consequence of this is that more of the segments in a word can vary;
(2) she learns to increase the number of values that each parameter can take on;
this increase means that there is a wider range of possible phones that can be put
into each segmental position in a word; (3) she learns to link up short programs
to make longer ones which can generate polysyllabic words.
In summary, the patterns of language behavior that we have surveyed suggest
that the child must initially discover (by trial and error) how to make sequences
of sounds, not merely how to make segments in isolation. Some of these
sequences she learns to vary systematically in one or two respects; these we
see as groups of similar words, that is, as sets of words belonging to canonical
forms. Other sequences she does not learn the trick of varying for a long time,
possibly because they were among the most complicated to begin with; these
remain phonological idioms. Some canonical forms run into developmental
dead ends: Daniel learned only to vary the vowel in his [dVn] canonical form,
producing only down and stone ([non]) with it. But apparently he could not go
on from there to learn to vary the place of articulation of the consonants; he had
to abandon his temporary conquest of nasal disharmony and make a fresh start.
V.E. The articulatory program and the general model
V.E.1. The output lexicon We have described many typical rules of child
phonology, we have considered what might be difficult about certain sound
sequences that children seem to avoid producing, and we have seen that many
rules may be explained as devices which children invent to get around those
difficulties. We have found rules that get rid of consonant clusters, of consonant
disharmony, and of particular sounds in particular environments. We have also
seen that there are some rules and looser strategies that cannot be explained in
terms of articulatory simplification, at least not in the usual sense. Instead, we
have had to invoke the idea that getting a word out involves the assembly of
some sort of articulatory program.
Let us now go back to another aspect of psycholinguistic modeling. There is
another important property of children's output that we have mentioned but not
really discussed: the fact that some rule changes are carried out gradually.
Sometimes this can be explained, following Macken (as we did earlier), by
postulating that the child has misheard some word to begin with or has replaced
an originally correct encoding of the word by an erroneous version based on his
own output. In either case, the result can be that when a new rule comes in which
should apply to the word, it will fail to do so because the word has the wrong
stored form. Recall that in Macken's example, taken from N. V. Smith, the child
had apparently stored take as [geik], because when all other velar-final words
had broken free of the consonant-harmony rule, that word remained harmonized
as an exception.
But often enough, there is quite a delay in applying a new rule to a word that
is already established in the output vocabulary, and this can happen even when it
is quite unlikely that there has been any miscoding of the word. For example, we
mentioned that it took about two weeks for the nasal assimilation rule to begin to
affect Daniel's down and stone, and several more before the new forms replaced
them entirely. What accounts for the persistence of these forms? The most
straightforward account, I think, is given by the two-lexicon model. What we
can say with this model is that ways-to-say words are stored, too, in an output
lexicon; application of a new rule to a word that is already in a child's active
vocabulary involves the ouster of the old form which was stored in the output
lexicon and its replacement by the new form. In this model, rules are the links
from the input lexicon to the output lexicon. To show this, our original figure is
relabeled in Figure 6.2.
Lags in the adoption of a rule, in this model, simply are cases in which a child
has formed the habit of saying a certain word a certain way and maintains that
habit instead of updating it.
Now we need to fit the notion of articulatory programming, which we
developed in the previous section, into the two-lexicon model. This proves
to be very easy to do. What we did in that section was to factor the stored
information about how to pronounce a word into two parts: (1) information as
to which canonical form it belongs to, and (2) information on how the variable
parameters in that canonical form should be chosen in order to produce the
word. For example, suppose that the child has an accurate rendition of dish as
part of a C[+voice] V[−tense] canonical form. We view its entry in the output
lexicon as consisting of the information that (1) it belongs to the canonical
form just mentioned and (2) the variable consonant parameters should be set
at [+dental, −continuant], giving [d] since the voicing parameter has been
fixed at [+voice]; meanwhile the variable vowel parameters should be set at
[+front, +high], giving [] since there is already a fixed vowel parameter of
[−tense].
[Input lexicon]
(Rules)
[Output lexicon]
Figure 6.2.
The actual production of a word that belongs to a canonical form thus takes
place in two stages. The first is recall of the canonical form and the stored
variable-parameter values from the output lexicon, and the second stage is
plugging the values into the articulatory program specified by the canonical
form. Figure 6.3 shows this elaboration of the two-lexicon model.
Phonological idioms remain as output lexical entries that cannot be factored;
that is, as entries in which there are no variable parameters to be set. This means
that in our model, the output lexicon contains only the specification of the
program; when it is called up, there is no plugging in of settings to be done: the
articulatory program (alias the canonical form) has been stored fully specified.
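The factored output lexicon and the two stages of production can be sketched as a lookup followed by a call. This is only an illustration under stated assumptions: the dictionary names PROGRAMS and OUTPUT_LEXICON, the function say, the form labels, and the three entries are hypothetical, and the entries reuse the hypothetical CV(CV) child of Section V.D rather than any recorded data.

```python
def cv_cv(place, vowel, syllables=1):
    """Articulatory program for a CV(CV) canonical form (voicing and nasality fixed)."""
    stop = {"labial": "b", "dental": "d"}[place]
    return (stop + vowel) * syllables


def fully_specified(form):
    """Program for a phonological idiom: nothing is left to plug in."""
    return lambda: form


# Canonical forms, each identified with the articulatory program that realizes it.
PROGRAMS = {
    "CV(CV)": cv_cv,
    "idiom:pretty": fully_specified("pretty"),   # orthographic stand-in
}

# Each output lexical entry: (canonical form, settings for its variable parameters).
OUTPUT_LEXICON = {
    "baby": ("CV(CV)", {"place": "labial", "vowel": "a", "syllables": 2}),
    "there": ("CV(CV)", {"place": "dental", "vowel": "e"}),
    "pretty": ("idiom:pretty", {}),              # unfactored entry: no settings
}


def say(word):
    form_name, settings = OUTPUT_LEXICON[word]   # stage 1: recall form and settings
    program = PROGRAMS[form_name]
    return program(**settings)                   # stage 2: plug the settings in


print(say("baby"), say("there"), say("pretty"))
```

The entry for the idiom has an empty settings dictionary, so the second stage has nothing to do; for the other words the same program is reused with different settings.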
V.E.2. Rules in the two-lexicon model We have occasionally used the
cover term transduction to mean all the steps from hearing to speaking a word.
As we have analyzed this process in terms of perception, storage, and production,
we have steadily been breaking it down into finer steps. We have said that
one of those steps is the connection between the input lexicon and the output
lexicon, and that step is mediated by rules. But we have really only talked about
rules in the usual informal mode of relating the adult model word to the child's
output word. We need to go back and see what we can deduce about the nature
of the rules that would fit into our model.
These rules must account for the difference between what the child knows
about the sound of a word as stored in the input lexicon and what is stored as
canonical form membership plus variable parameter settings in the output
lexicon. In the immature speaker, there is generally a loss of information at
this step; that is, kids do not make in production all the distinctions that they
can make in perception. The major function of the rules, then, is the selection of
which pieces of information about the adult word will be preserved in the output
lexicon and which will be abandoned; for this reason, we will refer to the rules in
our model which link the input lexicon with the output lexicon as selection
rules.
[Input lexicon]
(Rules)
[Output lexicon: entry for each word consists
of specification of canonical form plus
specification for each variable parameter]
[Articulatory instructions]
Figure 6.3.
Let us first consider how selection rules should look for a child who has
developed beyond the stage of having obvious canonical forms. For such a child
we gain very little by introducing the theoretical complexity of the factored
output lexicon, and we make our work easier if we go back to the older model in
which the output lexical entry for a word contains all the information needed to
say it (see Figure 6.2 again).
The notion of selection rule is especially convenient in discussing different
children's treatment of consonant clusters, so we will use that topic as an
example. The commonest pattern of initial cluster reduction for children acquir-
ing English seems to be the one used in baby talk: stops and nasals are retained,
liquids and fricatives lost; [sl] and [sw] clusters seem to be indeterminate.
(Incidentally, parents tend to perceive their children as adhering to this stereo-
typic pattern even when the child actually uses a different one; see Menn 1977;
Menn and Berko Gleason 1986.)
Some children find ways of breaking clusters apart, inserting [] or moving one
of the segments to another part of the word (e.g., saying [nos] for snow; Hamp
1974; also Waterson 1971). In this discussion, however, we need to focus on those
children who do reduce an adult initial consonant cluster to one segment, but who
do not do it just by omitting one of the segments. For /sp, st, sk/ we can find some
children who use the roughly corresponding fricatives [] or [f], [s], [x] or [s] to
represent the cluster (also [fw] for /skw/); for /sm, sn, sl/, some children use the
devoiced counterparts [m̥, n̥, l̥]. It is easy to see what is happening here: the child
is mapping the cluster into one segment by selecting some of the features
belonging to the first adult segment and some to the second one. This is usually
done with considerable regularity; that is, a given child will preserve either the
fricative character or the stop character of all s + stop clusters. (The treatment of
s + nasal clusters may differ from the treatment of s + stop clusters, however.)
Selection rules which produce effects such as these can be considered as
selecting features from a particular portion of a word in the input lexicon and
then putting them in a designated slot in the output lexical entry. Here, certain
position and manner features from initial consonant clusters are taken and put
into a slot so that they will designate the initial consonant of the output word.
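As a rough illustration only, a selection rule of this kind can be sketched as a short program. Everything in the sketch is an assumption made for the example: the two-feature inventory (place and manner, with voicing ignored), the helper names `find_segment` and `reduce_cluster`, and the `keep_manner_of` parameter are hypothetical and are not part of the model itself.

```python
# Hypothetical two-feature inventory; names and values are illustrative only.
FEATURES = {
    "s": {"place": "alveolar", "manner": "fricative"},
    "p": {"place": "labial", "manner": "stop"},
    "t": {"place": "alveolar", "manner": "stop"},
    "k": {"place": "velar", "manner": "stop"},
    "f": {"place": "labial", "manner": "fricative"},
    "x": {"place": "velar", "manner": "fricative"},
}


def find_segment(place, manner):
    """Return the symbol in the inventory that has this place and manner."""
    for symbol, feats in FEATURES.items():
        if feats == {"place": place, "manner": manner}:
            return symbol
    raise KeyError((place, manner))


def reduce_cluster(cluster, keep_manner_of):
    """Map a two-segment cluster onto one output segment.

    Place is always taken from the second segment; manner is taken from
    the first or the second segment, depending on the child's rule.
    """
    first, second = (FEATURES[seg] for seg in cluster)
    manner = first["manner"] if keep_manner_of == "first" else second["manner"]
    return find_segment(second["place"], manner)


# A child who preserves the fricative character of s + stop clusters:
fricative_child = [reduce_cluster(c, "first") for c in ("sp", "st", "sk")]
# A child who preserves the stop character:
stop_child = [reduce_cluster(c, "second") for c in ("sp", "st", "sk")]
```

Under these assumptions the first child maps /sp, st, sk/ onto [f], [s], [x] and the second onto [p], [t], [k]; the single parameter stands in for the regularity noted above, that a given child tends to treat all s + stop clusters alike.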
Now let us consider briefly the character of the selection rules that would
have to be written to characterize the behavior of a child who is still operating
with strict canonical forms. These rules must map the input lexical entry onto
the two-part output lexical entry which we constructed in Section V.D.
Therefore, they must be able to take each word in the input lexicon and specify
both the canonical form (which articulatory routine will be used) and how any
variable parameters are to be set.
A great deal of the variation from one child to the next is reflected right here.
Take the word snow; for some young children, this will be treated as a CV word
and most likely be produced as [no], [do] or [n̥o]. Other children may put any
target word containing a sibilant into a (C)Vs class, and so produce snow as
[nos] or [dos]. A child who tries to break up the cluster with an inserted vowel,
giving [səno], would probably have a CVCV canonical form to map it onto (but
this raises problems of stress, which is clearly fixed for some polysyllabic
canonical forms). A syllabic [s] for the first syllable is another possibility.
It is by no means clear how a child goes about picking what canonical form to
assign an adult word to. She may be quite systematic about it, say, assigning all
two-syllable words with initial consonants to CVCV and all fricative-final
monosyllables to CVs. But her assignments may seem rather more haphazard,
especially for words which could fit equally well into either of two forms and for
words that do not fit well into any form.
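The systematic case can be pictured with a minimal sketch. It assumes a hypothetical child with exactly the two assignment rules mentioned above; the function name `assign_canonical_form`, the simplified segment sets, and the rough transcriptions are all invented for the illustration.

```python
# Simplified, hypothetical segment classes for the illustration.
VOWELS = set("aeiou")
FRICATIVES = set("fvszʃ")


def assign_canonical_form(syllables):
    """Return the canonical form for a word given as a list of syllables,
    or None when neither of this hypothetical child's rules applies."""
    word = "".join(syllables)
    if len(syllables) == 2 and word[0] not in VOWELS:
        return "CVCV"  # two-syllable word with an initial consonant
    if len(syllables) == 1 and word[-1] in FRICATIVES:
        return "CVs"   # fricative-final monosyllable
    return None        # no rule applies: the haphazard cases


baby = assign_canonical_form(["be", "bi"])
bus = assign_canonical_form(["bas"])
dog = assign_canonical_form(["dag"])
```

The `None` result marks the words that fit no form well; a sketch like this says nothing about how a real child resolves them, nor about words that fit two forms equally well.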
When we consider children whose transduction patterns are less regular and
more like template matching, it is no longer possible to write selection rules; we
must be content with guidelines. Note, however, that it is possible for there to be
a fairly reliable rule for the choice of canonical form coupled with some
roughness in the way that variable parameter values appear to be selected
(Macken's Si; Priestly's child, both discussed in Section III.C.4). I do not
know if any case has been analyzed as having irregularity in the choice of
canonical forms coupled with regular rules for setting the variable parameters
once the form has been chosen.
V.E.3. Caution: the limitations of the programming metaphor. We set
up all this apparatus because it does a nice job of rationalizing the transduction
patterns that we seem to find, although there are some data that do not fit as
easily as one would like. This model is valid only to the extent that producing a
word is like running off some fairly simple sort of speech synthesis program.
I enjoin the reader to consider how the theory presented in this chapter might
be modified so that it simulates the behavior of real children better than it
presently can.
VI. Saying what one hears: task variables
VI.A. Imitation, self-monitoring, and spontaneous speech
We have been using the term transduction occasionally as a cover term for the
whole process of hearing and then saying a word (regardless of the time delay
between those events). One of the major phenomena of child phonology is the
great variability that can be found in the accuracy of a single child's
transductions, and the apparent relation of that accuracy to the conditions under
which the word was produced. There are three reasons why we must be able to
deal with this variability: first, obviously, since it exists we must be able to
incorporate it in our theories; second, we must take it into account in data
collection so as to get a proper sample of a child's performance; and third, in the
assessment of phonological development for clinical purposes, we face the
same sampling problem as in research data collection but with much greater
urgency because of the need for efficient use of time and because of the
consequences for the child.
In this section, we shall review some of the factors that are believed to be
involved in the observed variations. It is well known that imitated productions
of words may be much better than spontaneous ones; it is also known that they
may be just the same (Korte and Bond 1979) or simply different (Moskowitz
1975); and under some conditions, imitations can be worse than spontaneous
productions. This means that the factor of being imitated cannot be the only one
which produces variations in accuracy of transduction; other factors must be
interacting with it if the relation of spontaneous to imitated production is
unstable. We shall see that one of these factors is whether the target is already
in the child's output vocabulary. However, when this is taken into account there
is still a large residue of variation which does not seem to be a matter of the
choice of test words and tasks at all. Perhaps it is truly random, but there is some
evidence to suggest that another possible factor is the child's own
moment-to-moment appraisal of what task she is really being asked to do.
Let us first consider why imitation, so often used as a research or assessment
tool, is expected to improve a child's performance, and then why it may fail to do
so. Recall that "spontaneous" is actually used to describe utterances elicited from
the child by any (humane) means, as long as no one says the target word itself
within several minutes prior to the child's attempt at saying it. The intended
essence of this distinction between spontaneous and imitated speech is that
spontaneous utterances require retrieval of some encoding of the sound pattern of
a word from long-term memory, while imitation is supposed to rely on short-term
auditory memory. Thus imitation should be able to reflect the child's perceptual
and articulatory capacity unencumbered by incorrect stored information.
But careful consideration of this supposition and of the data that we actually
have about perception and about imitated production shows that it is, in general,
false, especially for very young children. One always relies to some extent on
old knowledge in both perception and production, and no imitation task can be
assumed to escape this reliance. Perhaps it is minimized when the subject
succeeds in categorizing and imitating novel sounds, as in Kent's task of
imitating foreign vowel sounds (1978). But in general, imitation does not mean
listening and reproducing without the interference of old habits; and if imitation
relies entirely on old knowledge, as it may when the child is asked to repeat a
familiar word, then imitated and spontaneous tokens of a word should be
identical.
Now let us see how imitated tokens might be worse than spontaneous tokens.
Barton's extensive work (1976, 1980) shows that children aged 2 years or
under have a strong tendency to mis-hear unfamiliar words, reinterpreting
them as familiar words which are phonetically similar. Barton attempted to do
minimal-pair word-discrimination tasks with very young subjects, and often
they had to be taught one of the words; for example, a 20-month-old might
know coat but not goat. Such a child could learn to choose pictures correctly
when one picture was a goat and the other was a bull, but given the minimal pair
coat–goat to discriminate, the child tended to pick coat regardless of whether he
heard coat or goat. (The bias depended only on this familiarity factor, not on any
phonemic factors.)
The implication of Barton's comprehension study for our consideration of
imitation should be clear: he was running into a perceptual bias, and the same
bias should be present in imitation tasks, unless they use words which are firmly
in the child's passive vocabulary. Even tasks using all nonsense syllables may
be affected, since any of them may be misperceived as a familiar word. (It is
worth remembering that this bias for hearing the novel as the familiar remains
throughout life, as anyone who has an uncommon name that resembles a
common one can testify.)
Another variable which is involved in transduction is self-monitoring.
Conscious monitoring is likely to improve the quality of one's output, and in
adults such self-monitoring seems to be maximal when other cognitive loads are
reduced; we assume that the same is true for children. Waterson (1978) has
certainly shown that for one child's spontaneous speech, phonetic quality
declined as the length of the phrase produced increased; this is suggestive, but
the cognitive load variation producing the phonological variation was a very
highly linguistic one, namely the length of the utterance the word was embed-
ded in, so one must be cautious about generalizing from it.
At any rate, children do indeed go off and quietly practice new sounds (Weir
1962 is the classic reference). Some children are also observed to whisper new
words and sounds (Leopold 1939). As an aside, we might reflect that the
observations of children practicing, whispering, and showing off new sounds
make the problem-solving theory of child phonology more credible (though
they in no way make the old embryonic development theory less credible). It
would be difficult to make sense of the claim that the acquisition of the ability to
pronounce is a matter of problem-solving, if children never acted as though they
were trying to solve a problem.
However, a large part of self-monitoring must also take place below the level
of consciousness, for the amount of feedback that must be involved in the
achievement of the fine control of the native speaker's accent is immense.
Returning to the main topic, now we must ask whether self-monitoring is
improved during imitation. It could be; during imitation, a child has an oppor-
tunity to compare her short-term phonetic memory of a word with her produc-
tion of it and/or with her long-term stored memory of how it should sound.
Sometimes such comparisons are made, sometimes they obviously are not.
There are plenty of recorded instances in which a child imitates an adult,
produces a deviant pronunciation ([ote] for okay, [fɪs] for fish), shows no
signs of dismay, and denies hotly that what she has said differs from what the
adult has said. Why the failure to spot such major discrepancies? In a good
many of these cases, we are sure that it cannot be ascribed to perceptual
difficulty, for if the adult goes on to imitate the child's faulty pronunciation, the
child also hotly rejects the adult's parody. The child can tell the difference. True,
sometimes the adult produces a very crude parody; in such cases one can argue
that the child would have accepted an accurate one. But in many cases, this
dodge is not available; the child would have been capable of distinguishing her
version from the correct version, if only she were paying attention. An unpub-
lished observation of Daniel will illustrate this point. During the period in which
all initial dental and labial stops were assimilated to following velar stops,
Daniel was requested to get his toy duck; the toy was out of sight, and the
adult simultaneously pointed in the direction in which it lay. Daniel echoed the
word as [dʌk], went and looked in the indicated place, said [gʌk], and toddled
away with no further interest. I suggest the following interpretation for this
sequence of events: in the absence of nonverbal supporting context, Daniel
failed to comprehend the word duck. He repeated it correctly under this con-
dition, clearly demonstrating that he had no perceptual problem with the adult
word whatever. Then he went and found it, and repeated his familiar form for it.
It must be the child's self-monitoring that is at fault when he is at this level of
discriminatory ability and still fails to recognize that there is a difference
between his output and the adult's. I suggest that there are two factors
contributing to this lack of attention. One has been mentioned above; if the child has
somehow arrived at the opinion that his word is adequate, he is likely to believe
that he always does it right. (See Zwicky 1982 on classical malapropisms.)
The other factor is a problem which also besets many Piagetian-style inter-
views of children: the problem of making sure that the adult and the child are
actually directing their attention to the same phenomenon. Suppose that we
consider an adult correcting a child's production of fish (to take a frequently
used example). The child says [fɪs], the adult says "No, say [fɪʃ::]," and the child
indignantly responds "But I did say [fɪs::]." The adult wants the child to attend
to the pronunciation, but how is this desire to be communicated to the child?
Language is usually used, not contemplated; children expect to listen for
meaning, not for sound. The child is more often disposed to understand the request to
"say fish" as "say the sound pattern that designates the object with fins and
scales that swims" than as "pronounce the word fish accurately."
At the beginning of this section, we said that these transduction variables are of
theoretical, methodological, and practical importance. By now these claims are
obvious. First, as for theory, what we have seen is that the variables of attention
and task orientation need to be incorporated into any model of child phonology.
We have developed the outlines of a model of child phonology without taking
account of these variables. We will not discuss in any detail how it can be
modified to allow for them, but, for example, some more boxes and arrows
need to be introduced into Figure 6.3 to represent the following statement:
spontaneous productions come from storage in the output lexicon, but imitated
ones, to the extent that they are better than spontaneous ones, bypass the output
lexicon and draw on less-automatic production mechanisms (Menn 1979).
Second, if we want to assess a child's ability to pronounce, whether our goals are
research or remediation, we want to know whether the child is using her best
pronunciation or an old familiar one. Recognizing that we have no control over
this variable, we need to think of tasks that would make attention to pronunciation
instead of meaning more or less likely.
Some speech pathologists and researchers attempt to test a child's best
articulatory capacities by asking her to imitate nonsense words. This is intended
to reduce the child's reliance on her habitual ways-of-saying known words; it is
a very reasonable procedure, but we have seen that a child may assimilate a
nonsense word to a known word in perception.
Her target word would then be different from the one that the examiner said,
and thus there would be two sources of error: the misperception of the target and
the effects, if any, of drawing on established articulatory habits. A very useful
discussion of these and other task variables in assessment is Menyuk (1980).
VI.B. The word as means to an end
We have introduced the problem of task variables in the context of observing
and testing the child who has begun to talk. We should also consider the role of
task variables during the transition from babbling to speech.
One of the perennial puzzles of child phonology is the phenomenon that got
exaggerated into the legend of the universal silent period: frequently a child
will be able to produce a sound in babbling even though she cannot put it into
any words. How can this come about? Like any other voluntary motor perform-
ance, the production of a sound or sound sequence is easier in some contexts of
action than others. Consider producing a given sequence of sounds under each
of the following conditions:
(i) having just made the sound(s) by accident (the context for circular
babble);
(ii) having just heard someone else make the sound(s) (the context for
imitation);
(iii) having decided to make the sound for its own sake or to execute the motor
sequence that will produce the sound(s) (the context for sound-play);
(iv) having decided to obtain a goal which requires the use of the sound(s) as a
subgoal (the context for meaningful utterances).
Observation tells us that (i) is the earliest, and therefore the easiest of these
four conditions, while (iv) is the hardest. It is not clear whether (ii) is easier than
(iii), however. But the important point is that (iv) requires the ability to carry out
(ii) or (iii) plus attention to the goal of the act of speaking. We might hypothesize
that the means–ends gap found here is the reason why sounds can appear in
babble before being used in speech, drawing on general principles of cognitive
development.
But there may be some other factors involved in this delay. For example, a
child might fail to realize that a sound made in play is just the one needed for
certain words. This might happen because the recall memory for the sounds in
those words is not strong enough to bring them to mind without supportive
contexts, even though the child can recognize them when others say them.
Second-language learners will certainly recognize this kind of recognition/
recall disparity.
In conclusion, we cannot say with certainty why a child is unable to use in
words a sound that he can produce in play, but there are many possible cognitive
reasons why this might happen, so there is no point to invoking some mystery of
the language faculty until it is shown that none of these possible reasons is
plausible.
Note: It is important to make one's analogies carefully when comparing
language with other cognitive abilities. There is possible confusion about my
use of the terms "means" and "end." Children can indeed learn to produce words
for various social and personal ends well before they show the innovative
means–ends behavior that is called for on Piagetian developmental scales. But
the kind of means–ends behavior required for the onset of meaningful speech is
of the most primitive variety; early words are acquired by plenty of practice and
are deployed in familiar situations for familiar purposes.
VII. The acquisition of allophones and allomorphs
So far, we have concentrated on the development of the child's ability to go
from a shallow phonemic input representation of the adult's word to some
tolerable output approximation of it. But this, of course, is only the surface of
the acquisition of phonology. How do children begin to dig below the phonetic
surface?
This is a major topic, and in this section we will only discuss some theoretical
issues and cite some of the recent studies in this area. To begin with, there are
terminological problems that I would like to avoid, so I will specify the terms
I will use in this section. A morphophonological or morphophonemic rule is one
which requires morphological information for its operation, e.g., a rule which
applies to verb stems, to plural morphemes, to members of a declensional class.
An allophonic rule is one which requires only phonological information: the
identity of neighboring sounds, boundaries, assigned stresses, etc. (Boundary
and trace markers are essentially devices for recasting morphophonological
rules as allophonic rules.)
A productive rule is one which would apply to new words coming into a
language and which can therefore be tested on nonsense words of properly
chosen shapes. The effects of nonproductive allophonic rules may persist for a
long time in redundancy rules, which specify possible output shapes of mor-
phemes without giving directions as to how aberrant morphemes are to be
rearranged.
A rule of any type, morphophonological or allophonic, productive or not,
may produce allomorphy: the appearance of a given morpheme in two or more
shapes that would be written distinctly in phonetic transcription. (Examples will
be supplied in the text as needed, rather than being given here.)
The distinction between superficial and cognitive aspects of acquisition has
been kept in clear focus (in fact, has been the focus of debate) in studies of the
acquisition of morphophonology. Berko Gleason's "wug test" ("Here is a wug.
Now there are two of them; there are two . . .") (Berko 1958/1971) contrasted the
child's ability to produce forms which might have been memorized (one glass,
two glasses) with nonword forms which could not have been heard before (tass,
tasses; gutch, gutches). Here, the pattern of /-s, -z, -əz/ allomorphy is productive
in the adult language, and the test distinguishes between the child who can
produce the correct allomorphs only on familiar words and the more advanced
child who can supply them for novel words and therefore must know the
underlying pattern.
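The productive pattern that the more advanced child must know can be stated compactly. The following is a minimal sketch under simplifying assumptions: the segment sets are abbreviated, the function name `plural_allomorph` is hypothetical, and the syllabic allomorph is written "əz" purely as a transcription choice.

```python
# Abbreviated, illustrative segment classes (not a full English inventory).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural_allomorph(final_segment):
    """Choose the regular English plural allomorph from a stem's final segment."""
    if final_segment in SIBILANTS:
        return "əz"  # tass -> tasses, gutch -> gutches
    if final_segment in VOICELESS:
        return "s"   # cat -> cats
    return "z"       # wug -> wugs; also after vowels and other voiced sounds


wug = plural_allomorph("g")
tass = plural_allomorph("s")
gutch = plural_allomorph("tʃ")
cat = plural_allomorph("t")
```

A child who has only memorized familiar plurals behaves like a lookup table with no entry for wug; a child who supplies the right allomorph for novel stems behaves more like the function, which is the distinction the test is designed to draw.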
When a pattern does not reach productivity in the adult language, as is the
case with many of the alternations in the late-acquired learned vocabulary in
English, it is more difficult to assess the degree to which a speaker has acquired
a pattern rather than a list of surface forms. As McCawley (1977) has pointed
out, when a pattern is nonproductive, it is probably not necessary to go beyond
memorization of a short list of words to be a competent user of the language.
However, some techniques show that a degree of awareness of such patterns
does develop in many speakers. It should be noted that the cognitive demands of
the acquisition of the common nonproductive rules of English (trisyllabic
laxing, various stress-shift rules, velar softening) are no greater than the
demands of the acquisition of the complex productive morphophonemics of
German or Russian. (Review of the acquisition of complex morphophonologies
is beyond the scope of this chapter; the reader should consult MacWhinney
1978.)
Several techniques have been developed for studying knowledge of non-
productive morphophonemic rules. There are the memory-reversion technique
of Myerson (1975), the meaning-guessing technique of Wilbur and Menn, and
the concept-forming technique most recently used by Jaeger. The Wilbur
and Menn (1974, 1975) technique is the simplest: here subjects were given
pseudo-words created from Latin or English morphemes according to regular
nonproductive patterns, and asked to pick among three possible meanings; for
example, for chibble the choices were (a) light rain, (b) a kind of smooth cloth,
(c) coarse sawdust. Responses of experimental subjects showed that attenuated
sound–meaning correspondences were indeed available to the subjects for most
of the obsolete allomorphic patterns: for chibble, 65% of the subjects chose
"coarse sawdust," 22% chose "light rain," and only 12% chose "smooth cloth";
for the test word abducive, 72% chose "distracting," 10% "conserving," and
18% "informing."
But as Linell (1979) correctly warns, one cannot infer awareness of particular
rules (e.g., the rules postulated by the Sound Pattern of English) just by showing
awareness of the allomorphy that those rules describe. Much more work is
required in this area, and Jaeger's, which is too complex to discuss here, is a
good start.
So far we have been discussing allomorphic relations that clearly go across
phoneme boundaries: equivalences of /s, z, əz/; or of /p/ and /b/ (chip, chibble).
How do we study the acquisition of strictly allophonic rules, that is, rules which
have purely phonological conditioning? Some of these also go across phoneme
boundaries (i.e., produce neutralization) and some do not. For example, final
devoicing of consonants in a language with a voice-voiceless distinction
produces neutralization (e.g., Hund [hʊnt]; Hunde [hʊndə], where the underlying
[d] in the singular cannot be distinguished from an underlying [t] unless one
looks at the plural or another inflected form). On the other hand, the lengthening
of vowels before voiced segments in English does not cause neutralization:
there is no problem reconstructing an underlying segment different from the
surface form.
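The neutralizing effect of final devoicing can be made concrete with a small sketch. The function name `final_devoicing` and the abbreviated obstruent table are assumptions made for the illustration; the German forms are given in broad transcription.

```python
# Abbreviated table of voiced obstruents and their voiceless counterparts.
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def final_devoicing(underlying):
    """Devoice a word-final obstruent; leave every other segment unchanged."""
    if underlying and underlying[-1] in DEVOICE:
        return underlying[:-1] + DEVOICE[underlying[-1]]
    return underlying


hund = final_devoicing("hʊnd")     # singular: the final /d/ is devoiced
hunde = final_devoicing("hʊndə")   # plural: /d/ is not final, so it survives
rad = final_devoicing("raːd")      # Rad 'wheel'
rat = final_devoicing("raːt")      # Rat 'advice'
```

Because two distinct underlying forms (here Rad and Rat) come out identical, the mapping cannot be run backwards from the singular alone, which is why the learner needs the plural or another inflected form to recover the underlying consonant.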
Allophonic rules are easy enough to study if their context can be
manipulated, that is, if the segment can be made to appear both in the conditioning context
and out of it. Vogel (1975) studied nasal assimilation to following stops in
Spanish, and Drachman and Malikouti-Drachman (1973) studied the same
phenomenon in Greek. The overall impression from such studies is that stages
of acquisition can be understood only from the perspective that the child is
trying to work his way back from the surface to account for the patterns he
observes; intermediate stages of rule acquisition need not look like simplified
versions of the rules written for adult phonology.
These two nasal assimilation rules, incidentally, both function in two kinds
of contexts: across morpheme boundaries, where they are easy to study by
manipulation of context, and within morphemes, where their productivity is
much harder to demonstrate. The techniques for appropriate tasks now exist,
however, such as repetition of synthetic stimuli which violate the rules, and we
can expect considerable progress on this front. In the meantime, some studies
are available on the achievement of adultlike control of the surface
manifestations of these rules: for example, Hawkins (1973, 1979a) on the acquisition
of proper stop duration within consonant clusters in English, and Naeser (1970)
on the duration of children's vowels before voiced and unvoiced stops in
English.
VIII. Summary and conclusion
It would be pleasant to say: these are the facts about the acquisition of phonol-
ogy. However, we must hedge, this being a human science, and say instead:
these are the major conclusions about the early stages of the acquisition of
phonology that appear to be justified at the present time.
(1) Some children take a very holistic approach to the acquisition of phonol-
ogy; their speech is so hard to transcribe and describe that we can say little
about them in existing theoretical frameworks (Peters 1977). Even the
more analytical children sometimes resort to holistic approaches to vary-
ing degrees.
(2) The child's early acquisition of phonology has two aspects: the acquisition
of phonetic control and the acquisition of phonemic contrast. Later, the
same dichotomy extends to the acquisition of the surface forms of words
vs. the acquisition of the patterns that they are instances of.
(3) For most of the children whose approach we currently can handle, we find
a rough division into an early period of very slow growth of the output
lexicon, and then a period of more rapid growth. However, some children
never show such a marked point of acceleration.
(4) Most of the words of the early period will have alternating vowels and
consonants. Some words will probably be very well controlled and be
more complex in structure (progressive phonological idioms); others may
be extremely vague and variable in their output token forms.
(5) During the early period and for a while thereafter, most words will fall into
groups. The words in such a group will be similar in both syllable structure
and phonetic content; they will be describable as instances of a canonical
form.
(6) The acquisition of phonemic oppositions can be studied only in terms of
syllable structures: the typical picture is for a child to have a particular
feature contrast in one position (initial, intervocalic, preconsonantal,
final . . .) well before it appears in others. Within a given position,
phonemic contrast may be evidenced indirectly before the child achieves good
control of the pair of adult phonetic features involved in the contrast, but
on the other hand there may be good phonetic control of one value of a
feature without the presence of a contrasting phone.
(7) The mismatches between adult model and child word are the results of the
child's trial and error attempts; they are shaped by the child's articulatory
and auditory endowments (and thus to that extent are natural) and by the
child's previous successes at sound production. All rules of child
phonology are learned in the sense that the child must discover for herself each
correspondence between the sounds that she hears and what she does with
her vocal tract in an attempt to produce those sounds.
(8) Knowledge gained by articulatory success on a particular sound or sound
pattern does not always generalize to cases which we phonologists feel to
be similar: a feature or a phone or a string mastered may remain an isolated
success for a long time.
(9) Regular mapping patterns (rules) grow, generalize, and often
overgeneralize, even to the point of diminishing the child's accuracy of production
of some words.
(10) Whole-word mapping strategies are used to varying degrees and are a
major type of irregular mapping. Even in later stages of acquisition, such
strategies can be found on the more difficult polysyllables.
(11) Instead of modifying adult words which are not within her capacity to
produce accurately, a child may use selection strategies, avoiding prob-
lematic sounds and sound sequences and/or exploiting favorites.
(12) As implied by the phrasing of all these statements, individual variation
among children is considerable. A deterministic theory would therefore
have to be so weak as to be meaningless. Yet typical patterns emerge. The
prevailing theories allow for individual variation by considering the child
to be experimenting with solutions to the problem of how to say words. As
we look across children, trying to discover what tends to be earlier, and
therefore presumably easier, and what tends to be later, and therefore
presumably harder, we find three articulatory sources of difficulty for the
young speaker: the articulation of certain phones (e.g., [, ]), the
sequencing of dissimilar consonantal targets, and departures from CVCV. . .
alternations. Difficulties also arise from perceptual sources, including a
tendency to perceive unfamiliar forms as similar familiar ones, and (prob-
ably) an inability to take in all the information about a relatively long word
until its most salient sounds have already been well learned.
Finally, unexpected hindrances and aids may arise from the child's
current array of strategies: a sequence which should be easy may be
difficult for a particular child because it does not fit into the rules or
prosodic strategies that she happens to have developed up to that time.
(13) This chapter presents the view that the child's mastery of production
mechanisms can be described as learning to (a) control the accuracy of
articulatory movements, (b) specify more contrasting articulatory targets
in a given sequence position, (c) produce more different sequence types,
and (d) concatenate sequence types.
Let us conclude by considering the assertion made in the introduction to this
chapter: that evidence from the studies which have become available in the last ten
or fifteen years has forced a change in our basic conception of the nature of
phonological development. We can no longer sustain the developing-embryo
model; we need problem-solving models to make sense of peculiarly skewed
output distributions such as we find in children who avoid or exploit heavily.
Just as in the acquisition of morphology and syntax, what has been called the
implicit defining question of our research has changed. We used to ask: what
linguistic theory will explain the order in which the various language behaviors
develop? This question assumed that there is such an order, and that it should be
explainable by linguistic theory. The new question is roughly: what behavioral
predispositions and abilities does the child bring to the task of learning to
communicate with language, and how does the individual go about solving the
articulatory and phonological problems posed by the language to be learned? The
presuppositions of the second question differ markedly from the first. We now
presuppose that there are a variety of predispositions and abilities of memory,
motor control, perception, etc., including perhaps some purely linguistic
predispositions which might have evolved just for handling the special rapid
information processing and complex pattern learning involved in the acquisition
and use of language. We also presume that the notion of problem solving is the
best heuristic for explaining the kind of very rough consensus of developmental
order that we find in the data. As for the old assumption that linguistic theory can
explain what we find in acquisition, we have seen that the more likely scenario is
that linguistic theory and acquisition data will have to come to terms with one
another. A theory based only on the performance of the mature skilled user cannot
anticipate the temporary learning devices and detours of the unskilled learner.
Note: In this chapter, the feminine form has often been used for the indefinite
pronoun. The reader may not realize it, but my female colleagues and I are still
receiving professional form correspondence (for example, reprint requests) that
addresses us as 'Dear Sir / Sehr geehrter Herr / Cher monsieur'. At least until I have
evidence that more scientists in this field can conceive of their fellows in two sexes
as well as in three languages, I think it well to jog their sense of markedness a bit.
Note
1. Author's note (2012): A better description of this phenomenon, as I eventually
discovered in teaching phonetics, is that when speakers who do not make these
distinctions are asked to attend to them, they can distinguish the three words, but
they cannot tell which is which without explicit reference to the spelling.
References
Barton, D. P. (1976). The role of perception in the acquisition of speech. PhD disserta-
tion, University of London. Circulated by Indiana University Linguistics Club.
Barton, D. P. (1980). Phonemic perception in children. In G. Yeni-Komshian,
J. F. Kavanagh, and C. A. Ferguson (eds). Child phonology: perception and pro-
duction, vol. 2, pp. 97116. New York and London: Academic Press.
Bell, A. (1971). Some patterns of occurrence and formation of syllable structures. In
Working Papers on Linguistic Universals. Linguistics Department, Stanford
University, 6, 23137.
Berko, J. (1958). The child's learning of English morphology. Word, 14, 1507.
Reprinted in A. Bar-Adon and W. Leopold (eds.), Child language: a book of
readings, pp. 15367. Englewood Cliffs, NJ: Prentice Hall, 1971.
Branigan, G. (1979). Sequences of words as structured units. Unpublished PhD disser-
tation, Boston University School of Education.
Butterworth, B. (ed.) (1983). Language production, vol. 2. London: Academic Press.
Clark, E. V. and Bowerman, M. (1986). On the acquisition of voiced stops. In J. A. Fishman,
A. Tabouret-Keller, M. Clyne, B. Krishnamurti, & M. Abdulaziz (eds.), The
Fergusonian impact, vol. I: From phonology to society, pp. 5168. Berlin: Mouton
de Gruyter.
Clumeck, H. (1977). Studies in the acquisition of Mandarin phonology. Unpublished
PhD dissertation, University of California at Berkeley.
Drachman, G. and Malikouti-Drachman, A. (1973). Studies in the acquisition of Greek as a
native language. Ohio State University Working Papers in Linguistics, 15, 99114.
Farwell, C. B. (1976). Some strategies in the early production of fricatives. Papers and
Reports in Child Language Development, 12, 97104. Stanford University
Linguistics Department.
210 Lise Menn
ition. Language, 51, 41939. Reprinted in this volume as Chapter 4.
Ferguson, C. A. and Macken, M. A. (1983). Phonological development in children's play
and cognition. In Keith E. Nelson (ed.), Children's Language, vol. 4, pp. 23154.
New York: Gardner Press.
Ferguson, C. A., Peizer, D. B., and Weeks, T. (1973). Model-and-replica grammar of a
child's first words. Lingua, 31 (1), 3565.
Fey, M. and Gandour, J. (1979). Problem-solving in early phonology acquisition. Paper
read at the Annual Meeting of the Linguistic Society of America, Los Angeles.
(1982). Rule discovery in early phonology acquisition. Journal of Child Language, 9,
7182.
Flege, J. E. and Massey, K. P. (1980). English prevoicing: random or controlled? Paper
read at the Summer Meeting of the Linguistic Society of America.
Goldstein, U. (1980). An articulatory model for the vocal tracts of growing children.
Unpublished PhD dissertation, Electrical Engineering Department, MIT.
Halliday, M. A. K. (1975). Learning how to mean: explorations in the development of
language. London: Edward Arnold.
Hamp, E. H. (1974). Wortphonologie. Journal of Child Language, 1(1), 2878.
Hawkins, S. (1973). Temporal coordination of consonants in the speech of children:
preliminary data. Journal of Phonetics, 1, 181217.
(1979a). Temporal coordination of consonants in speech of children: further data.
Journal of Phonetics, 7, 23567.
(1979). The control of timing in children's speech. In Proceedings of the Ninth
International Congress of Phonetic Sciences. Copenhagen.
Ingram, D. (1974). Phonological rules in young children. Journal of Child Language, 1,
4964.
(1976). Phonological disabilities in children. New York: Elsevier.
(1979). Phonological patterns in the speech of young children. In P. Fletcher and
M. Garman (eds.), Language acquisition, pp. 13348. Cambridge University Press.
Itkonen, T. (1977). Notes on the acquisition of phonology. English summary of
Huomioita lapsen äänteistön kehityksestä. Virittäjä, 279308.
Jaeger, J. J. (1980). Categorization in phonology: an experimental approach.
Unpublished PhD dissertation, University of California at Berkeley.
A. Keiler. The Hague: Mouton. (Originally published as Kindersprache, Aphasie
und allgemeine Lautgesetze. Uppsala: Almqvist & Wiksell.)
Jones, L. G. (1967). English phonotactic structure and first-language acquisition. Lingua,
19,159.
Kent, R. D. (1978). Imitation of synthesized vowels by preschool children. Journal of the
Acoustical Society of America, 63, 11938.
Kisseberth, C. W. (1970). On the functional unity of phonological rules. Linguistic
Inquiry, 1, 291306.
Korte, S. S. and Bond, Z. S. (1979). Children's spontaneous and imitative speech: An
acoustic analysis. Paper read at meeting of American Speech and Hearing Society,
November.
Leonard, L., Schwartz, R., Folger, M. K., and Wilcox, M. J. (1978). Some aspects of
children phonology in imitative and spontaneous speech. Journal of Child
Language, 5 (3), 40316.
Leopold, W. F. (1939–49). Speech development of a bilingual child, vols. I–IV.
Evanston: Northwestern University Press.
Linell, P. (1979). Psychological reality and the concept of phonological rule. In
Proceedings of the Ninth International Congress of Phonetic Sciences. Copenhagen.
units of acquisition. Lingua 49, 1149. Reprinted in this volume as Chapter 5.
(1980). The child's lexical representation: the puzzle-puddle-pickle evidence.
Journal of Linguistics, 16, 117.
Macken, M. A. and Barton, D. (1980). The acquisition of the voicing contrast in English:
a study of voice onset time in word-initial stop consonants. Journal of Child
Language, 7, 4175.
MacWhinney, B. (1978). The acquisition of morphophonology. Monographs of the
Society for Research in Child Development 43, 12.
McCawley, J. (1977). Acquisition models as models of acquisition. In R. Fasold and
R. Shuy (eds.), Studies in language variation, pp. 5164. Washington, DC:
Georgetown University Press.
Menn, L. (1971). Phonotactic rules in beginning speech. Lingua, 26, 2254l.
(1976a). Pattern, control, and contrast in beginning speech: a case study in the
acquisition of word form and function. Unpublished PhD dissertation, University
of Illinois. Circulated by Indiana University Linguistics Club.
(1976b). Semantics of intonation contour in late babble and beginning speech
(English). Paper read at the Summer Meeting, Linguistic Society of America.
(1977). Parental awareness of child phonology. Paper read at the Annual Meeting of
the Linguistic Society of America.
(1979). Transition and variation in child phonology: modelling a developing system.
In Proceedings of the Ninth International Congress of Phonetic Sciences.
Copenhagen.
(1981). Review of S. E. Blache, The acquisition of distinctive features. Language, 57,
9538.
Menn, L. and Haselkorn, S. (1977). Now you see it, now you don't: tracing the development
of communicative consciousness. In J. Kegl (ed.), Proceedings of the Seventh
Annual Meeting of the NorthEast Linguistic Society.
Menn, L. and Berko Gleason, J. (1986). Baby talk as stereotype and register. In
J. A. Fishman et al. (eds.), The Fergusonian impact: vol. 1: From phonology to
society, pp. 11125. Berlin: Mouton de Gruyter.
Menyuk, P. (1977). Language and maturation. Cambridge, MA: MIT Press.
(1980). The role of context in misarticulations. In G. Yeni-Komshian, J. Kavanagh,
and C. A. Ferguson (eds.), Child phonology, vol. I. New York and London:
Academic Press.
Moskowitz, A. (1970a). The two-year-old stage in the acquisition of English phonology.
(1970b). The acquisition of phonology. Working paper no. 34, Language-behavior
Research Laboratory, University of California, Berkeley.
(1975). The acquisition of phonetics: a study in phonetics and phonology. Journal of
Phonetics 3, 14150.
Myerson, R. (1975). A developmental study of children's knowledge of complex derived
words of English. PhD dissertation, Harvard Graduate School of Education.
Naeser, M. A. (1970). The American child's acquisition of differential vowel duration.
Technical report no. 144 (in two parts), Wisconsin Research and Development
Center for Cognitive Learning. University of Wisconsin, Madison WI.
Nakazima, S. (1972). A comparative study of the speech development of Japanese and
American children, part IV. Studia Phonologica, 6, 137.
Peters, A. M. (1977). Language learning strategies. Language, 53, 56073.
Platt, C. and MacWhinney, B. (1983). Solving a problem vs. remembering a solution:
error assimilation as a strategy in language acquisition. Journal of Child Language,
7, 4175.
Sander, E. K. (1972). When are speech sounds learned? Journal of Speech and Hearing
Disorders, 37, 5563.
Schwartz, R. G. and Leonard, L. B. (1982). Do children pick and choose? An examina-
tion of phonological selection and avoidance in early lexical acquisition. Journal of
Child Language, 9, 319336.
Slobin, D. I. (1966). Comments on Developmental Psycholinguistics. In F. Smith and
G. A. Miller (eds.), The genesis of language, pp. 8591. Cambridge, MA: MIT
Press.
(1973). Cognitive prerequisites for the development of grammar. In C. A. Ferguson
and D. I. Slobin (eds.), Studies of child language development, pp. 175208. New
York: Holt, Rinehart & Winston.
Smith, B. L. (1979). A phonetic analysis of consonantal devoicing in children's speech.
Press.
(1978). Lexical acquisition and the acquisition of phonology. Summer Forum
Lecture, Linguistic Institute of the Linguistic Society of America.
Snow, C. E. (1977). The development of conversation between mothers and babies.
Stampe, D. (1969). The acquisition of phonemic representation. Proceedings of the Fifth
Regional Meeting of the Chicago Linguistic Society, pp. 43344.
Sterne, D., Jaffe, T., Beebe, B., and Bennett, S. L. (1975). Vocalizing in unison and in
alternation: two modes of communication in the mother-infant dyad. In D. Aaronson
and R. W. Rieber (eds.), Annals of the New York Academy of Sciences, vol. 263:
Developmental psycholinguistics and communication disorders.
Stevens, K. N. (1972). The quantal nature of speech: evidence from articulatory-acoustic
data. In P. B. Denes and E. E. David (eds.), Human communication, a unified view,
pp. 5166. New York: McGraw-Hill.
Velten, H. V. (1941). The growth of phonemic and lexical pattern in the infant. Language,
19, 4404. Reprinted in A. Bar-Adon and W. Leopold (eds.), Readings in child
language, pp. 8291. Englewood Cliffs, NJ: Prentice-Hall, 1971.
Vihman, M. M. (1976). From prespeech to speech: on early phonology. Papers and
Reports on Child Language Development, no. 12, Stanford University Linguistics
Department.
Vihman, M. M. (1978). Consonant harmony: its scope and function in child language.
In J. H. Greenberg (ed.), Universals of human language, vol. III, pp. 281334.
Stanford University Press.
Vihman, M. M. (1981). Phonology and the development of the lexicon: evidence from
children's errors. Journal of Child Language 8, 239264.
Vogel, I. (1975). Nasals and nasal assimilation patterns in the acquisition of Chicano
Spanish. In Papers and Reports on Child Language Development no. 10, Stanford
University Linguistics Department.
von Raffler-Engel, W. (1973). The development from sound to phoneme in child
language. In C. A. Ferguson and D. I. Slobin (eds.), Studies of child language
development, pp. 912. Trans. from Proceedings of the Fifth International
Congress of Phonetic Sciences, Munster, 1964.
Waterson, N. (1956/1970). Some aspects of the phonology of the nominal forms of the
Turkish word. In F. R. Palmer (ed.), Prosodic analysis, pp. 17487. Oxford
University Press.
(1970). Some speech forms of an English child: a phonological study. Transactions of
the Philological Society, 1, 1240.
(1971). Child phonology: a prosodic view. Journal of Linguistics, 7, 179221.
Reprinted in Waterson 1987. Reprinted in this volume as Chapter 3.
(1972). Perception and production in the acquisition of language. Proceedings of the
International Symposium on First Language Acquisition, Florence. Reprinted in
Waterson 1987.
(1978). Growth of complexity in phonological development. In N. Waterson and
C. E. Snow (eds.), The development of communication, pp. 41520. New York:
Wiley. Reprinted in Waterson 1987.
(1987) Prosodic phonology: the theory and its application to language acquisition
and speech processing. Newcastle upon Tyne: Grevatt & Grevatt.
Weir, R. (1962). Language in the crib. The Hague: Mouton.
Wellmers, W. E. and Harris, Z. S. (1942). The phonemes of Fanti. Journal of the
American Oriental Society, 62, pp. 31833.
Westbury, J. R. and Keating, P. A. (1980). A model of stop consonant voicing and a
theory of markedness. Paper read at the Annual Meeting of the Linguistic Society of
America.
Wilbur, R. B. and Menn, L. (1975). Psychological reality, linguistic theory, and the
internal structure of the lexicon. San Jose State University Occasional Papers in
Linguistics. Program in Linguistics, San Jose State University.
Wilbur, R. B. (1981). Theoretical phonology and child phonology: argumentation and
implications. In D. Goyvaerts (ed.), Phonology in the 1980s, pp. 40329. Ghent:
StoryScientia.
Wilbur, R. B. and Menn, L. (1974). The roles of rules in generative phonology. Talk
presented at summer meeting, Linguistic Society of America.
Yeni-Komshian, G., Kavanagh, J., and Ferguson, C. A. (eds.) (1980). Child phonology:
perception and production. New York and London: Academic Press.
Zwicky, A. M. (1982). Classical malapropisms and the creation of a mental lexicon. In
L. Menn and L. K. Obler (eds.), Exceptional language and linguistics, pp. 11532.
New York and London: Academic Press.
Part III
Cross-linguistic studies
7 One idiosyncratic strategy in the acquisition
of phonology
T. M. S. Priestly
Introduction
The data presented here are offered as a possible example of an insight into
underlying forms (in the sense used by Ingram 1970, 1974a), and in any case as a
clear and extensive example of what Ingram (1974a) calls an idiosyncratic
phonological rule, or what Ferguson and Farwell (1975) call an individual
learning strategy. Such rules/strategies, although idiosyncratic and transient,
are none the less informative: 'il n'y a chez [les enfants] ni incohérence ni effets
du hasard . . . Ce n'est pas le tireur maladroit qui frappe à l'aventure, c'est un
bon tireur qui ne dispose que d'une arme défectueuse ou mal pointée . . .'
(Grammont 1902: 62); 'the individual's phonological idioms at any age are
not mysterious aberrations, but are manifestations of the natural course of
phonological development' (Ferguson and Farwell 1975: 438).
Data
The forms listed in Appendix II were observed over a thirteen-week period,
from when the writer's son Christopher (C) was 1;10.2 to when he was 2;1.4. At
first, words were noted sporadically; when it became clear that C was doing
something not only amusing but systematic, the note-taking was tackled more
systematically too. All C's forms in Appendix II were attempts at polysyllabic
words in his parents' speech; but not every polysyllabic word was subjected to
this strategy. During the fourth week of observation (W4) a list was made of the
other, more ordinary forms spoken by C; these are labeled bisyllabic ordinary
forms (BOFs) and are exemplified in Appendix I. This list totaled 68 words, and
thus almost equaled the list of words which were experimented upon
(70 words) in toto, and is twice the length of the list of experimental forms
noted during that particular week. Possible reasons for the choice of the
strategy and why it was not applied to these words are discussed below.
Meanwhile, it may be noted that all C's forms in Appendix II, henceforth
bisyllabic experimental forms (BEFs), have medial [j], and correspond to
forms in his parents' speech without medial [j].
On a number of occasions forms were deliberately elicited from C, e.g.,
Parent: 'Say monster.' C: [majn]. The problem of the validity of elicited
forms, discussed (for example) by Edwards and Garnica (1973) and Ferguson
and Farwell (1975), arises here. Edwards and Garnica report that there are no
substantial or systematic differences between spontaneous and imitated forms;
apparently contrary reports (Ljamina 1958; McNeil and Stone 1965; Suxanova
1968) refer to imitated meaningless words, and thus do not apply to the situation
described here, where all the input forms were related to something (if only a
picture of an unusual animal in a book, thus approximating the situation
with Berko's (1958) wugs and gutches). In any case, whatever the disadvantages
of the elicitation approach may be, it is assumed that they were by far
outweighed by the advantages of securing a more complete list of data.
The analysis and conclusions are based on the whole list, with no distinction
between spontaneous and elicited forms.
Gradually the 70 input forms in question came to be pronounced by C not as
BEFs with medial [j], but as more recognizable forms with medial consonants
more similar to those of the input. These ordinary replacement forms (ORFs)
were noted only once each, when first heard, unless they were repeated in a very
different phonetic form. Many of the ORFs were deliberately elicited in W12
and W13. (Note that two of these, (11) and (68), are not altogether normal;
see Analysis below.)
During W7, W10, W11, and W12 C pronounced six words that can
be classed as neither BEFs nor ORFs. These had the phonetic shape VːC or
CVːC, and corresponded to input forms for which BEFs had already been
produced. These monosyllabic experimental forms (MEFs) are also listed in
Appendix II (ː indicates length).
In Appendix II, the numerals refer to the weeks during which the forms
were noted. Items (16), (35–6), (49), (52), and (65) were noted without
reference to the week, and were not checked at the end of the experimental
period for their ORFs. Items (3), (8), (18), (53), (57–8) require special
discussion; see the following section. The transcription is a broad phonetic
one, and represents a degree of normalization that is important in at least
three aspects. Firstly, [j] represents the palatal glide, and was not particularly
tense; its ambiguous position with regard to syllabication will be discussed
later in this chapter. Secondly, [a] represents what was assumed to be free
variation between [a] and []. Thirdly, voiced obstruents were normally, but
not always, devoiced in prepausal position; they are all shown here as
voiceless. This normalization, however, only refers to experimental forms
(BEFs and MEFs); towards the end of the period in question, C was clearly
achieving the [a] vs. [] and the prepausal voiced vs. voiceless distinctions;
for simplicity, all ORFs are shown with these distinctions. Stress is shown on
ORFs, but not on BEFs; on the rare occasions when one syllable was more
obviously stressed than the other, it was the first syllable that was affected.
Both of C's parents speak two varieties of Southern British English very
close to RP.
Idiomatic forms
Inspection of Appendix II shows obvious correspondences between output and
input forms, and the analysis presented below reveals a great deal of systematicity
in these correspondences. There are, however, some output forms which are
exceptional. Some of these can be explained as involving phenomena such as
metathesis; for others, however, on-the-spot observation suggested that what
Paesov (1968: 232) calls paronymic attraction was involved. We shall call
these items idiomatic. They are:
(18) chocolate: one of the first BEFs used by C was [kajak]. Because of its early
occurrence, or its similarity to the word kayak, or for other reasons, this
became (in its adult form [kjk]) an alternative household word for
chocolate; and although C did not normally understand his parents'
attempts to replicate his own BEFs, he did so in this instance. Hence the
double entry under Input for this item in Appendix II; hence also,
perhaps, the inconsistencies with which C pronounced this word, both
as BEF and as ORF.
(3), (8) tooth-paste, police-car: in W4, C first called his toy police-car [pija].
On one occasion, reference to the toy was confused (by the parent in
attendance) with reference to a tube of toothpaste; C also became confused,
and thereafter used this form to refer to both objects.
(53), (57–8) medicine, monster, music: the same kind of history probably
lay behind the use of [mjas] as the BEF for these three items (e.g., taking
cough medicine while watching a TV program), although no one incident
was noted by C's parents.
In the analysis of the data, we include (53) and (57–8), since the evidence for
paronymic attraction is incomplete; where any of the three is exceptional, this
kind of confusion may be adduced. We exclude (3) because the confusion was
quite obvious. We include all the forms for chocolate only because this is one of
the few forms to have a MEF, and is therefore extraordinarily informative.
Analysis
In making the first analysis of the data, we treat only the relationships between
input and output forms. Other relationships are not assumed a priori, and are
discussed later (as coincidences and reversions). In this preliminary analysis,
we state our findings in terms of correspondences, to avoid making hypotheses
about the processes involved. The equation (42a) [nl]:[fajan] is the reiteration
of a datum; the correspondences []:[f], []:[a], etc., which are extracted
from this datum, are, with all the other correspondences, set up on the basis of
phonetic similarity and regularity of occurrence.
We analyse the Input–BEF equations, the Input–MEF equations, and
the Input–ORF equations separately. Since the first-named involve complex
systems of correspondences, we deal with consonantal and vocalic equations
separately before making a combined analysis.
Input–BEF equations: consonants
In item (50) [h]:[haja] it is clear that there is a close phonetic correspondence
between input-initial [h] and BEF-initial [h], and between input-medial []
and BEF-final []; and also that BEF-medial [j] corresponds to no consonant in
the input. We therefore draw up the equation [h-]:[h-j-], where the sequential
positions of the phonetic symbols represent the syntagmatic position of the
sounds. Similar close phonetic correspondences hold for (2a), (10), (12), (15),
(21), (32), (47–8), (63), and (64b). Generalizing from these 11 instances, we set
up the formulaic equation
$C_1 C_2 : C_1 j C_2$ (Equation A)
where the subscript numerals identify and equate the symbols appearing on
either side of the colon. Looking further, this formulaic equation can be
extended to embrace many more items in the data, if the criterion of phonetic
similarity is relaxed. The degree of relaxation is of course arbitrary.
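The item listings in this section abbreviate consecutive items, so that a count such as "these 11 instances" is easy to misread. The sketch below is an illustration only, not part of the original analysis: the helper name `expand_items` and its parsing rules are assumptions about how the listings are abbreviated (a range such as 47–8 standing for items 47 and 48, and "61a, b" standing for 61a and 61b).

```python
import re

def expand_items(listing):
    """Expand a parenthesised item listing into individual item labels.

    Hypothetical helper: assumes '(47-8)' abbreviates items 47 and 48,
    and '(61a, b)' abbreviates the variants 61a and 61b.
    """
    items = []
    for token in re.findall(r"\(([^)]*)\)", listing):
        token = token.replace("\u2013", "-").strip()  # accept en-dashes too
        span = re.fullmatch(r"(\d+)-(\d+)", token)
        variants = re.fullmatch(r"(\d+)([a-z](?:,\s*[a-z])+)", token)
        if span:
            start, tail = span.group(1), span.group(2)
            # An abbreviated end borrows the leading digits of the start.
            if len(tail) < len(start):
                tail = start[: len(start) - len(tail)] + tail
            items.extend(str(n) for n in range(int(start), int(tail) + 1))
        elif variants:
            number, letters = variants.group(1), variants.group(2)
            items.extend(number + letter.strip() for letter in letters.split(","))
        else:
            items.append(token)
    return items

# Item (50) plus the ten further items cited for close correspondence.
close_matches = "(50), (2a), (10), (12), (15), (21), (32), (47-8), (63), (64b)"
print(len(expand_items(close_matches)))  # 11
```

Read this way, item (50) together with the ten further items gives the 11 instances on which Equation A is based.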
Normally, the analysis of child phonology employs foreknowledge about
possible and probable acquisitional processes; but when an idiosyncratic
strategy is at issue, caution is required. However, the fact that the decisions
must be arbitrary at this stage need cause no concern; we are merely setting
up formulae for subsequent detailed evaluation. Arbitrarily, therefore, we
include under Equation A, in addition to the 11 items listed above, the
following: (4), (6), (11), (13a), (17), (19), (30–1), (35–9), (40b), (41),
(42a), (43), (46), (51), (53), (57–8), (61b), (65), and (68–9); and, with
reservations, (13b), (18a, b), and (34).
It is noteworthy that all the input forms involved so far have a single stress on
the initial syllable, with the exception of (12). Using the notation $C_{pt}$ to denote
post-tonic consonant, therefore, we can cover all the items mentioned so far
with the equation
$C_1 C_{pt} : C_1 j C_{pt}$
since in all the items except (12), $C_{pt} = C_2$; while in (12) $C_{pt}$ and $C_2$ are both [n].
It may also be pointed out that (62) is covered by this reformulation of Equation
A, but not by Equation A itself. See Conclusion below for further discussion.
If, next, (61a) is considered, [rbt]:[rajat], a different formulaic equation is
obviously required. Here there is a clear phonetic correspondence between the
input form and the BEF with regard to their initial and final consonants, but
there is no phonetic similarity between the input-medial [b] and the BEF-medial
[j]; or, in equational form, [r-t]:[r-j-t]. The same sort of equation, with close
phonetic correspondences, holds for (1), (2b), (5), (7–8), (17), (19), (22a),
(23–4), (31), (33), (49), (52), (54a), (56), (62), (65), and (67). On the basis of
these 19 correspondences, we set up the general formulaic equation
$C_1 C_f : C_1 j C_f$ (Equation B)
where $C_f$ means final consonant. We further extend Equation B, in more or
less arbitrary fashion, to cover 13 more items: (9), (25–6), (28), (40a), (41),
(42b), (45), (51), (55), (60), (66a), and (70).
It must be noted that some items are equally well covered by both Equation A
and Equation B; for example, the final [t] in (51) may be equated with either the
medial [dr] or the final [nt] of the input form [hjdrnt].
A subset of the items covered by Equation B must be studied further. (22a),
for example, has the correspondence [k-r-t]:[k-j-t]; here not only initial and
final, but also the medial consonants seem to correspond: the phonetic similarity
between [r] and [j] may be considered sufficient. The same [r]:[j] equation
obtains in (5), (7), (9), (23), (28), (45), (67), and (70); and another phonetically
similar equation, [l]:[j], occurs in (2b), (8), (54a), and (66a). We therefore have
to allow for another formulaic equation:
$C_1 C_2 C_f : C_1 j_2 C_f$ (Equation B′)
which may replace Equation B in the nine instances where the input form has
medial [r] and the four instances where it has medial [l]. We do not extend the
criterion of similarity to cover other input-medial consonants: see the discussion
of substitutions, below.
We are left with a residue of only 7 items not covered by Equation A or by
Equation B or by both: (13c), (14a), (16), (20), (29), (44), and (59). In some of
these cases, equations involving metathesis and the like can be suggested;
but there is no regularity, and these items are best left for later discussion.
Input–BEF equations: vowels
We proceed in the same way with the analysis of the vowel correspondences.
First, we draw up the equation
$V_1 V_2 : V_1 V_2$ (Equation I)
on the basis of examples such as (56), [mskks]:[majks]. With certain
extensions of the phonetic similarity criterion, this equation is held to cover
(1), (2b), (4–6), (10), (13), (19), (22a), (23–6), (28), (30b), (32–7), (39–43),
(45–7), (50–1), (53), (54a), (56), (59), (63), (64b), (67), and (69).
A set of other items, such as (2a), [pilow]:[pijal], requires a different equation,
where the second vowel in the BEF does not correspond to the second
vowel in the input. For this item and 11 others not covered by Equation I,
$V_1 X : V_1 a$ (Equation II)
is required. The 11 items are (2a), (9), (11), (15), (18b), (31), (38), (52), (60),
(61a, b), and (65). This equation, being more general, also covers most (but not
all) of the items covered by Equation I: for details, see Table 7.1.
Neither Equation I nor Equation II covers the vocalic correspondence in (48).
For this item, and for 4 others not yet covered, we require a third equation:
$V_1 X : a V_1$ (Equation III)
which holds for (7b), (27), (48), (57b, c) and (70b). This also covers items which
are covered by Equations I and II, or by both; see Table 7.1.

Table 7.1. Input–BEF equations [a]

AI      $C_1 V_1 C_2 V_2 : C_1 V_1 j V_2 C_2$
        (64b)     ij m juw : i j u m
        (47b)     w s l : w i j u s
AII     $C_1 V_1 C_2 : C_1 V_1 j a C_2$
        (2a)      p l ow : p i j a l
        (18b)?    t kl t : k a j a k
AIII    $C_1 V_1 C_2 : C_1 a j V_1 C_2$
        (27)      k ow st : k a j ow s
        (57b, c)  m nst : m a j n/s
BI      $C_1 V_1 V_2 C_f : C_1 V_1 j V_2 C_f$
        (2b)      p l ow : p i j ow
        (56)      m sk ks : m a j ks
BII     $C_1 V_1 C_f : C_1 V_1 j a C_f$
        (9)       b r ij z : b j a s
        (52)      m n t : m i j a t
        (60)      r k dz : r j a s
BIII    $C_1 V_1 C_f : C_1 a j V_1 C_f$
        (7b)      p r d : p a j s
        (70b)     r nd : a j t
B′I     $C_1 V_1 C_2 V_2 C_f : C_1 V_1 j_2 V_2 C_f$
        (2b)      p i l ow : p i j ow
B′II    $C_1 V_1 C_2 C_f : C_1 V_1 j_2 a C_f$
        (9)       b r ij z : b j a s
B′III   $C_1 V_1 C_2 C_f : C_1 a j_2 V_1 C_f$
        (7b)      p r d : p a j s
        (70b)     r nd : a j t

[a] Most other items in the data are ambiguous in the sense that they are covered by more than
one of these equations. The details are as follows: AI or AII: (30b), (35–7), (39), (46), (47a), (53),
(69); AII or AIII: (11), (15), (38), (61b); AI or AII or AIII: (4), (6), (10), (13a, b), (32), (34), (40b),
(42a), (43), (50), (63); BI or BII: (1), (33), (45), (54a); BII or BIII: (61a); BI, BII or BIII: (5), (22a),
(23–6), (28), (40a), (42b); AII or BII: (65); AI, AII, BI or BII: (19), (41); AII, AIII, BII or BIII: (31);
AI, AII, AIII, BI, BII or BIII: (51); B′I or B′II: (45), (54a); B′I, B′II or B′III: (5), (22a), (23), (28).
In addition, some items cannot be assigned to a combined equation because of some anomaly. Thus,
(12), (18a), (21), (30a), (57a), (58), (68) are covered by A but the vocalic equation is uncertain; (7a),
(8), (17), (49), (55), (66a), (67), (70a) are covered by B but the vocalic equation is uncertain; of these
latter, (7a), (8), (66a), (67), (70a) are also covered by B′. Further, (62a, b) are covered by A or B
but the vocalic equation is uncertain. (13c) is covered by I, II, and III but the consonantal equation is
uncertain. Finally, neither vocalic nor consonantal equations may be assigned to the residue, viz.
(14a), (16), (20), (29), (44), (59), (64b).
In this way, all but 10 of the 87 bisyllabic correspondences are dealt with;
the residue contains (7a), (14a), (17), (18a), (30a), (49), (57a), (58), and (70a).
The 12 items with polysyllabic input are then considered; all would be covered
by one of the equations suggested, with appropriate modifications to cover the
extra syllables.
Combined equations for input–BEF correspondences
The consonantal equations A, B, and B′ combine with the vocalic equations I,
II, and III to form nine equations. Each of these is required specifically for
only one, two or three of the items in the data; but, interestingly, all are
required. All the other items in the data are covered by more than one
equation. The details are presented in Table 7.1. The residue of items which
are not listed on this table is as follows: (7a), (8), (12), (13c), (14a), (16–17),
(18a), (20–1), (29), (30a), (44), (49), (55), (57a), (58), (59), (62), (64b), (66a),
(67–8), (70a).
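The way the consonantal and vocalic equations combine can be sketched in a few lines of code. This is an illustration under stated assumptions, not the author's procedure: the names `combined_labels` and `build_bef` are hypothetical, the third consonantal equation is written B' here, and segments are given in the simplified ASCII transliteration that appears in the text. The consonantal equation decides which input consonant the caller passes as the final consonant (the post-tonic consonant for A, the word-final consonant for B and B'); the vocalic equation decides how the two vowel slots around medial [j] are filled.

```python
from itertools import product

CONSONANTAL = ("A", "B", "B'")   # choice of the consonant that closes the BEF
VOCALIC = ("I", "II", "III")     # how the two vowel slots are filled

def combined_labels():
    """Every consonantal equation paired with every vocalic equation."""
    return [c + v for c, v in product(CONSONANTAL, VOCALIC)]

def build_bef(c1, v1, v2, c_last, vocalic):
    """Assemble a C1-V-j-V-C form according to one vocalic equation.

    Hypothetical helper: I keeps both input vowels, II replaces the
    second vowel with [a], III puts [a] first and V1 second.
    """
    if vocalic == "I":
        first, second = v1, v2
    elif vocalic == "II":
        first, second = v1, "a"
    elif vocalic == "III":
        first, second = "a", v1
    else:
        raise ValueError("unknown vocalic equation: " + vocalic)
    return " ".join([c1, first, "j", second, c_last])

print(combined_labels())
# Item (2a) under AII, with [l] as the post-tonic consonant:
print(build_bef("p", "i", "ow", "l", "II"))   # p i j a l
# Item (27) under AIII, with [s] closing the form:
print(build_bef("k", "ow", "", "s", "III"))   # k a j ow s
```

The two sample calls reproduce the output forms shown for (2a) and (27) in Table 7.1; items whose transcriptions are incomplete in the text are deliberately left out.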
Input–MEF equations
Three of the six MEFs, (22b), (54b), and (66b), correspond to their input forms,
with close phonetic similarity, according to the equation
$C_1 V_1 C_f : C_1 V_1 C_f$ (Equation X)
One other MEF, (61c), corresponds to its input form by a similarly obvious,
yet different, equation:
$C_1 V_1 C_2 : C_1 V C_f$ (Equation Y)
Item (18c), [kk], may be listed under Equation Y if the input is to be taken as
[kjk] (see above); if the input form is [tklt], this MEF is anomalous, as is
(14b), [bm].
Input–ORF equations
Since the ORFs are not the chief object of study, their analysis along the lines set
out above is not undertaken here; most of the necessary equations are very
straightforward, and all except one are of no pertinence to a study of the
experimental forms. For (11) and (68), however, the ORFs [bsak] and [kas]
require a special equation:
$C_1 V_1 C_2 C_3 : C_1 V_1 C_2 a C_3$ (Equation Z)
The choice of strategy
It was noted in the data section above that BOFs observed in W4 totaled twice as
many as the BEFs observed during the same week. Why should C have applied
his strategy to only about one-third of the input forms to which it could have
been applied? If all or most of the words represented by BOFs had already been
learnt before the experiment began, surely C would have been well enough
versed in the business of dealing with bisyllabic words, and would not have
required any strategy at all. Following Drachman (1973), we may call recourse
to strategies a form of avoidance (see also Ferguson, Peizer and Weeks 1973;
Ferguson and Farwell 1975); but inspection of the items in Appendices I and II
does not suggest the particular reason for the avoidance suggested by
Drachman, viz. a potential abundance of homonyms. Rather, it may be that
the number of polysyllabics became, at a certain stage, overwhelming; and/or
that the special strategy developed from a more normal process (see below);
and/or that the particular phonetic problems posed by some polysyllabics
proved too difficult a barrier. Whatever the impulse for choosing a strategy,
we have here an excellent example of variation along the lexical parameter
(Menyuk 1971; Hsieh 1972; Ferguson and Farwell 1975).
Potentially more answerable than the question why C had recourse to any
strategy at all, or why he applied it to some words and not to others, is the
question why the canonical shape CVjVC should have been chosen. Three
reasons are suggested; perhaps all three were involved:
(1) Familiarity. Not only was C used to producing words like this (note that 25
percent of the BOFs in Table 7.1 are of this shape); he was also used to
hearing many other common household and childhood words with medial
[j] in the English of his parents.
(2) Ease of articulation. It can be argued that, given the task of devising a
bisyllabic form that would require the least articulatory effort and yet would
still be irreproachably bisyllabic, the optimal medial segment should be a
glide. A form with less constriction in this position would tend towards the
shape CVVC (which is the shape of the MEFs!); a form with more constric-
tion would require greater effort. Of the glides available, [j] and [w], the
former was chosen, perhaps for the reasons given in (1) and (3). In Equations I
and II the [a] which is apparently supplied (so to speak) out of thin air to
complete the bisyllabic form may be a recapitulation of the normal choice of
[a] as the child's first phonological vowel (Jakobson 1941: 47–8).
(3) Substitution for liquids. It is argued in the following section that Equation
B may well represent one particular (and very common) strategy, viz. the
substitution of [j] for [r], and perhaps also for [l]; and that this substitution
was employed not only in the period after the experimental strategy came
to an end, but also in the preceding period. If this is so, then the whole
idiosyncratic strategy may have developed as an extension of the substitu-
tion process.
Substitution?
The BEFs [kajat] for carrot and [pijow] for pillow strongly suggest the process
of simple substitution; for the replacement of [j] for medial liquids is very
common in child phonology, and the other phonetic adjustments are minor. If
it were certain that this was indeed the case, all the forms involved should be
treated separately and indeed passed over quickly.
It is clear, however, that the items in the data covered by Equations AI, AII,
and AIII are not to be considered as candidates for any explanation based on
substitution, without recourse to obviously contrived explanations. The items
covered by BIII and BII are more amenable to this sort of explanation, although
these too are rather forced. Indeed, if the general explanation of C's strategy also
accounts for the items covered by BII and BIII, then it seems reasonable to
accept this strategy in toto for all the items concerned.
On the other hand, substitution must seriously be considered as a likely
explanation for all items covered by BI, where the order and representation of
both vowels and consonants is quite straightforward. The matter is, however, far
from simple; a number of other points must be considered, as follows.
(1) Most of the items which show two different BEFs corresponding to the
same input form, where one BEF fits Equation BI and the other does not, are
characterized by the fact that the BEF which fits an equation other than BI
was noted later than the one which does fit that equation. This suggests
that, if the BI forms are explicable as being the result of substitution, the
idiosyncratic strategy developed after, and perhaps from, a substitution
process.
(2) The idiosyncratic strategy, whatever it was, appears to have been firmly
entrenched in W3, cf. (18a), (32), (34), (38), (53), (58a), none of which can
be examples of substitution; since the strategy was employed this early, it
may be valid for all subsequent forms.
(3) C was not heard to pronounce any BOFs with medial liquids during W4;
this suggests that, whatever his methods of dealing with bisyllabics with
other medial consonants, C was at this stage substituting [j] for [r] (cf. the
BEFs for (23–4), (28)) and perhaps also for [l] (cf. the BEF for (66a)).
(4) The weeks just prior to the end of the experimental period are also infor-
mative. Table 7.2 shows that BEFs corresponding to input forms with
medial liquids were replaced with ORFs at a relatively late stage. ORFs
with medial liquids were initiated by [lat] in W8, which was followed by
[gras] in W10; these were the first words ever observed in C's speech with
medial liquids. Further, the last two BEFs corresponding to input forms
with medial non-liquids occurred in W9; both were exceptional (the
idiomatic (3), and the form [kaja], (13), which could be a regressive idiom
in Ferguson and Farwell's terms (1975: 432), i.e., perseverac (Ohnesorg
1959: 74)). The last regular BEFs corresponding to input forms with
medial non-liquids, therefore, belong to W7, half-way through the
experimental period! On the other hand, BEFs corresponding to input forms
with medial liquids were noted in W9, W11, even in W12. All this suggests
that C, who in W10 had begun to pronounce ordinary forms with medial
[r], was still finding difficulty with this type of word.
(5) Finally, the substitution of glides for liquids is very common in child
phonology. A survey was made of 52 works in the available literature in
which [j] was reported as a substitute for medial consonants; the works
covered monolinguals in 15 different languages and bilinguals in 3 different
linguistic situations. Overall, [j] occurred as a substitute for medial conso-
nants in the following numbers of reports: [11 ] 39, [r ] 31, [] 9,
[z] 8, [n ] 6, [v] 5, [s] [ts] [h] 3, [] [t] [g] [b] [] 2, [d] [t] 1. As
far as English is concerned, [w] is far more frequent than [j] as a substitute
for medial liquids (see e.g., Snow 1963; Edwards 1971; Kresheck, Fisher
and Rutherford 1972); but if an English-speaking child such as C does use
medial [j] as a substitute, it is more probable that he is substituting this
consonant for liquids than for other types of consonant.
In summary, we adopt the position that C was probably not relying on a single
strategy for all the BEFs in the data, but that he was rather employing two or
more strategies (substitution, and the idiosyncratic one which is discussed
below) in apparently haphazard fashion. The idiosyncratic strategy may,
indeed, have developed from the process of substitution; and when it was
abandoned, substitution seems to have persisted. We thus accept the probability
that BEFs which come under Equation I were the result of substitution, as long
as they correspond to input forms with medial liquids; this amounts to eight
forms, (2b), (5), (22a), (23), (28), (45), (54a), (67). We must emphasize that
substitution is to be rejected as improbable (and in very many cases as highly
improbable) for all the other items in the data; and also that these forms may
(under Equation B) be accounted for by the interpretation suggested below for
C's experimental strategy as a whole. We therefore proceed with our analysis,
omitting these eight items from further consideration.
Table 7.2. Patterns of replacements of BEFs by ORFs
         Medial consonant of ORF
Week     Cluster   Stop   Nasal   Fricative   l    r
4        0         1      0       0           0    0
5        0         0      1       0           0    0
6        1         1      1       0           0    0
7        0         0      0       1           0    0
8        0         1      1       2           1    0
9        1         1      0       1           0    0
10       3         3      2       0           0    1
11       0         3      0       0           0    1
12       2         7      5       6           0    3
13       0         0      0       4           3    3
Total    7         17     10      14          4    8
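The counts in Table 7.2 can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it transcribes the table into a Python dictionary (the names `TABLE_7_2`, `column_totals`, `first_week`, and `late_replacements` are hypothetical), recomputes the column totals, and finds the first week in which an ORF with a medial liquid replaced a BEF.

```python
# Table 7.2, transcribed: number of BEFs replaced by ORFs in each week,
# broken down by the medial consonant of the ORF.
COLUMNS = ("cluster", "stop", "nasal", "fricative", "l", "r")
TABLE_7_2 = {
    4: (0, 1, 0, 0, 0, 0),
    5: (0, 0, 1, 0, 0, 0),
    6: (1, 1, 1, 0, 0, 0),
    7: (0, 0, 0, 1, 0, 0),
    8: (0, 1, 1, 2, 1, 0),
    9: (1, 1, 0, 1, 0, 0),
    10: (3, 3, 2, 0, 0, 1),
    11: (0, 3, 0, 0, 0, 1),
    12: (2, 7, 5, 6, 0, 3),
    13: (0, 0, 0, 4, 3, 3),
}


def column_totals(table):
    """Sum each medial-consonant column over all weeks."""
    return {name: sum(row[i] for row in table.values())
            for i, name in enumerate(COLUMNS)}


def first_week(table, column):
    """Earliest week with at least one replacement in the given column."""
    i = COLUMNS.index(column)
    return min(week for week, row in table.items() if row[i] > 0)


def late_replacements(table, columns, from_week):
    """Return (late, total): replacements in `columns` from `from_week` on, and overall."""
    idx = [COLUMNS.index(c) for c in columns]
    total = sum(row[i] for row in table.values() for i in idx)
    late = sum(row[i] for week, row in table.items() if week >= from_week for i in idx)
    return late, total


if __name__ == "__main__":
    print(column_totals(TABLE_7_2))
    print(first_week(TABLE_7_2, "l"), first_week(TABLE_7_2, "r"))
    print(late_replacements(TABLE_7_2, ("l", "r"), 10))
    print(late_replacements(TABLE_7_2, ("cluster", "stop", "nasal", "fricative"), 10))
```

On this tabulation, replacements with medial [l] begin in W8 and those with medial [r] in W10, and 11 of the 12 liquid replacements fall in W10 or later, against 35 of the 48 non-liquid replacements. The cut-off at W10 is chosen here only for illustration.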
Interpretation
Concerning underlying forms
What follows is based upon the approach to child phonology developed by
Ingram (1970, 1974a, 1974b). In this approach, as we understand it, allowance
is made for potential disparity between forms at four stages, as follows: the
adult's spoken form, the child's perceived form, the child's underlying form
(UF), and the child's spoken form. Probably the most significant of the three
types of potential disparity is that between the child's underlying form
(in Ingram's terms) and the child's spoken form, i.e., the disparity caused by
articulatory factors. Our interpretative remarks, below, are therefore based on
the possibility that the UF may be different from the output form. Since,
however, we do not regard it as proven that other (i.e., perceptual and retention)
factors are totally negligible, we also allow for the UF to be potentially different
from the input form too. In what follows, then, we attempt to specify as much as
possible about C's UFs on the basis of the available data (but no more than what
these data tell us), on the understanding that these UFs may be (but are not
necessarily) different from both the input and the output forms.
Homonyms
There are nine sets of homonyms in the data. It may be argued that, since the
input forms in each case are all different, while the output forms are articula-
torily identical, the UFs for the members of each set are different. The situation
is a complex one, and we argue elsewhere that instrumental and experimental
evidence is required for these arguments to be viable (Priestly 1980). Assuming,
however, that this evidence may be of value, we may inspect the homonymic
data. Most of the data must be discarded,¹ and we are left with four pairs of
homonyms: (11) & (38); (13a) & (15); (25) & (26); and (31) & (42a). It can be
seen that, in each set, the input forms and the BEF are bisyllabic; they all have
the same $C_1$; they all have a low $V_1$; and they all share one other consonantal
correspondence, by Equation A or Equation B. The lowest common denominator
for all four sets, then, which may be taken as an estimate of the minimal
content of the UFs, is $C_1 + V_1 + C_n + \sigma$, where $C_n$ means another consonant,
perhaps the most noticeable one, and $\sigma$ means another syllable. The minimal
content of each UF may then be estimated:
(11)   bskt    {b + V_low + k + σ}   bajak
(38)   blkt    {b + V_low + k + σ}   bajak
(13a)  tjg     {t + V_low + g + σ}   tajak
(15)   tkij    {t + V_low + k + σ}   tajak
(25)   kbd     {k + V_low + d + σ}   kajat
(26)   kvd     {k + V_low + d + σ}   kajat
(31)   fwntn   {f + V_low + n + σ}   fajan
(42a)  flnl    {f + V_low + n + σ}   fajan
Note that the $C_n$ for (13a), and (25–6) is set up on the understanding of normal
obstruent-devoicing in final position; and that these UFs (enclosed in braces) are
minimal estimates: the UFs for (38) and (42a) may well include the [l] in the
initial cluster, for example.
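How the minimal UFs for the four homonym pairs lead to identical outputs can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' formalism: the low first vowel is assumed to surface as [a], the extra syllable is assumed to surface as [j] plus [a], and the devoicing map and the names `FINAL_DEVOICING`, `realize_bef`, `MINIMAL_UFS`, and `OBSERVED_BEFS` are hypothetical.

```python
# Hypothetical map for obstruent devoicing in final position
# (the text assumes "normal obstruent-devoicing"; the exact pairs are an assumption).
FINAL_DEVOICING = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def realize_bef(c1, cn, v_low="a", extra_syllable=("j", "a")):
    """Spell out C1 + V1(low) + Cn + extra syllable as a CVjVC output form.

    Assumptions of this sketch: the low vowel surfaces as [a], the extra
    syllable surfaces as glide [j] plus [a], and Cn is devoiced word-finally.
    """
    glide, vowel = extra_syllable
    return c1 + v_low + glide + vowel + FINAL_DEVOICING.get(cn, cn)


# Minimal UF content (C1, Cn) for each member of the four homonym pairs.
MINIMAL_UFS = {
    "11": ("b", "k"), "38": ("b", "k"),
    "13a": ("t", "g"), "15": ("t", "k"),
    "25": ("k", "d"), "26": ("k", "d"),
    "31": ("f", "n"), "42a": ("f", "n"),
}

# BEFs as listed in the text for the same items.
OBSERVED_BEFS = {
    "11": "bajak", "38": "bajak",
    "13a": "tajak", "15": "tajak",
    "25": "kajat", "26": "kajat",
    "31": "fajan", "42a": "fajan",
}

if __name__ == "__main__":
    for item, (c1, cn) in MINIMAL_UFS.items():
        print(item, realize_bef(c1, cn), OBSERVED_BEFS[item])
```

Note that (13a) and (15) differ in the voicing of $C_n$ in the minimal UF, yet converge on the same output once final devoicing applies.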
Coincidences and reversions
In discussing strategies, above, we postponed analysis of relationships between
two output forms for the same input form. We now suggest that only under
certain chronological conditions may a direct relationship between two such
forms (and identity of their UFs) be postulated. We impose the following
limitation on the time-factor: if an experimental and a non-experimental form
were observed during the same week, they are treated as contemporaneous
(coincidences); and if C used an experimental form subsequently to using a
non-experimental form, it may be said that he reverted to his experiment
(reversions). In both cases, a direct relationship, and UF identity, are postulated.
In each case, the pairs of forms are inspected for minimal shared content, as in
the preceding section, and the minimal UF for the two forms is estimated as
follows:
Coincidences
(32) BEF [fajam], ORF [fm] {f + V_low + m + σ} (W6)
(57) BEF [majs], ORF [mst] {m + + s + σ} (W6)
(33) BEF [sjan], ORF [svn] {s + + n + σ} (W7)
(7) BEF [pajs], ORF [prs] {p + + s + σ} (W12)
(22) MEF [kt], ORF [krt] {k + + t + σ} (W12)
(54) MEF [mn], ORF [mln] {m + + n + σ} (W12)
Reversions
(34) ORF [fk] (W4), BEF [fajak] (W6) {f + V_low + k + σ}
(12) ORF [bnan] (W5), BEF [bajan] (W6) {b + a + n + σ}
(18) ORF [kkat] (W6), BEF [kajak] (W9) {k + a + k + σ}
(66) ORF [lt] (W8), MEF [t] (W10) { + t + σ}
(61) ORF [rbt] (W9), MEF [rp] (W10) {r + + b + σ}
(45) ORF [grjl] (W10), BEF [gijal] (W12) {g + V_high + l + σ}
Although one might wish to discard some of these items (e.g., (18); see the
discussion of idiomatic forms, above), the general picture is clear: again, the
lowest common denominator in all these pairs may be expressed by the formula
$C_1 + V_1 + C_n + \sigma$; note that now the symbol $\sigma$ is also used to represent the extra
component of length in the MEFs. Some of the details in the formulation (e.g.,
the UF initial consonant for (34), or some of the vowels) may be arbitrary, but
the overall strategic UF is quite apparent.
Minimal input used
As a third approach to interpreting the data, we collate the combined corre-
spondences as displayed in Table 7.1, to determine the minimal amount of
information shared by them all; this is set out in Table 7.3.
Again, it is clear that all the equations share the consonant and vowel of the
initial syllable, plus one other noticeable consonant. In addition, all the BEFs
have the extra syllable, and all the MEFs have the extra component of length;
we may again propose the formula $C_1 + V_1 + C_n + \sigma$ as the canonical UF for all
experimental forms, and also for the two aberrant ORFs (11), (68). Clearly,
many of the UFs will contain information extra to this; but all will share this
general shape.
The strategy and its component ruses
C's strategy can be summarized, therefore, as comprising at least the following.
Of the input forms, the initial consonant and the first vowel (this last sometimes
perhaps in a simplified form), one other noticeable consonant (either $C_2$ or
$C_f$), and the fact that the word was of more than one syllable, were internalized.
All the output forms represent realizations of these parts, which were formed
into a phonological whole according to one of a number of component
substrategies, or ruses, as follows: (1) the fact of bisyllabicity (σ) was realized
as an extra syllable, viz. either CV or VC, where C was [j] and V was either a
representation of the second vowel of the input, or was the neutral vowel [a] or
[]. This was the major ruse, and produced all the BEFs in the data (which may
be further subdivided into categories according to Equations A and B, I, II or
III); (2) the bisyllabicity (σ) was realized as a lengthening of the underlying
vowel. This (minor) ruse produced the MEFs, which may also be subcategorized
according to Equations A and B; and (3) σ was realized as the vowel [a]
which was inserted between the two consonants which constituted the
postvocalic cluster in the UF for (11) and (68).
Table 7.3. Inputs to equations
              Equation   C1   V1   C2   V2   Cf
BEFs          AI
              AII
              AIII
              BI
              BII
              BIII
MEFs          X
              Y
(11), (68)    Z
No suggestions are offered concerning the choice of CV vs. VC in the major
ruse, i.e., the one involved in the BEFs; in other words, we see no reasons to explain
why, e.g., (48) [wmn] was represented as [wajum] rather than *[wujam], or why,
e.g., (36) [swld] was pronounced [sjat] and not *[sajt] or *[sajowt].
It may be pointed out that the glide [j] may be interpreted as a predictably
inserted barrier between the two vowels, which would otherwise be in hiatus. The
only difference between the BEFs and the MEFs (apart from the particular choice
of vowel) would then be no more than insertion vs. non-insertion of this glide.
It is of course possible to formalize the strategy and its ruses by recourse to
sets of (ordered or unordered) rules. This will not be attempted here at any
length; the following examples, however, include some suggestions as to
possible ordered rules that might be incorporated:
(47) (Equation AI): [wsl] {w + + s + σ} (σ = l or l̩?)
Here, σ → u (the velarization of the lateral being incorporated in the V, cf. the
ORF [wisu]), giving w + + u + s; after glide-insertion, [wijus].
(2a) (Equation AII): [plow] {p + + l + σ}
Here, σ → the neutral vowel a, giving p + + a + l; after glide-insertion, [pijal].
(27) (Equation AIII): [kwst] {k + ow + s + σ}
For all Equation III items, σ (which is realized as the neutral vowel in this
instance) is inserted before the $V_1$ of the UF: k + a + ow + s; after
glide-insertion, [kajows].
(56) (Equation BI): [mskks] {m + V_low round + ks + σ} (σ = )
Here, the UF may well be represented as including a fully specified []; we give
only the minimal content of the UF. σ is inserted after $V_1$, giving m + V_low round
+ σ + ks; after glide-insertion (and, presumably, adjustment of the $V_1$), [majks].
In this case, other routes suggest themselves, e.g., the [ks] of the output may
perhaps correspond to the medial [sk] of the input, and metathesis may be
involved.
(60) (Equation BII): [rkdz] {r + + z + σ}
Here, σ → the neutral a, inserted after $V_1$: r + + a + z; after glide-insertion and
final devoicing, [rjas]. The UF may in addition include the [d] of the input,
which is elided in the ORF for this item.
(70) (Equation BIII): [rnd] { + + d + σ}
The UF consonant is suggested arbitrarily. Any form from the complete cluster
[nd] to the simple [d] would surely have the same outcome: + a + + d → [ajt].
(22) (Equation X): [krt] {k + + t + σ} and
(61) (Equation Y): [rbt] {r + + b + σ}
For the MEFs, σ → a component of length, which can be represented as a
reduplication of the $V_1$: k + + + t → [kt] and, with final devoicing, r + +
+ b → [rp].
(11) ORF (Equation Z): [bskt] {b + V_low round, + sk + σ}
The a is inserted, here and in (68), between the two consonants: b + V_low round,
+ s + a + k → [bsak].
A great deal of the above is suggested very tentatively, and some of the
formulations are arbitrary. In particular, the glide-insertion might well be replaced
with another formulation. Note also that some reference to stress may also be called
for; see the next section. Other examples could be adduced: thus (44), with
consonantal metathesis, and (58b), with vocalic metathesis, provide interesting
exercises in formalization; as does (55), where the final [] of the output seems to
represent a fusion of the [g] plus the [n] of the input. (17) is especially puzzling: the
expected BEF, given the available ruses, would surely be *[dajowt] or *[dowjat].
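The ordered rules suggested for the BEFs (placement of the extra-syllable vowel, glide-insertion, final devoicing) can be sketched as a small Python function. This is a hedged illustration rather than the formalization itself: the function name `derive_bef`, its parameters, and the devoicing map are hypothetical, and for (2a) and (47) the first vowel is written as it surfaces in the output forms [pijal] and [wijus], which is an assumption of this sketch.

```python
# Hypothetical final-devoicing map (an assumption of this sketch).
FINAL_DEVOICING = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def derive_bef(c1, v1, cn, sigma_vowel="a", sigma_before_v1=False):
    """Derive a bisyllabic experimental form from a minimal UF.

    Rule 1: the extra syllable is realized as a vowel (by default the neutral
            vowel [a]), placed after V1, or before V1 as in the Equation III
            example.
    Rule 2: glide-insertion: [j] separates the two vowels, which would
            otherwise be in hiatus.
    Rule 3: final devoicing applies to the remaining consonant.
    """
    first, second = (sigma_vowel, v1) if sigma_before_v1 else (v1, sigma_vowel)
    return c1 + first + "j" + second + FINAL_DEVOICING.get(cn, cn)


if __name__ == "__main__":
    # (27), Equation AIII: k + a + ow + s, then glide-insertion
    print(derive_bef("k", "ow", "s", sigma_before_v1=True))
    # (2a), Equation AII: neutral vowel after V1 (V1 written as "i": an assumption)
    print(derive_bef("p", "i", "l"))
    # (47), Equation AI: extra vowel realized as u (V1 written as "i": an assumption)
    print(derive_bef("w", "i", "s", sigma_vowel="u"))
```

The three calls reproduce the forms [kajows], [pijal], and [wijus] cited in the examples. The sketch covers only the major ruse; the lengthening ruse behind the MEFs and the a-insertion of Equation Z are left out.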
Conclusion
Stress
It was pointed out, in discussing the consonant equations, that while the analysis
was concerned with syllable position, the stress of the vowels might in fact be
the deciding factor. Since we are dealing with English, most of the forms have
the stress on the first vowel; the only exceptions are the rather uninformative (8),
(12), (16) and (62). It must therefore be emphasized that the canonical UF
for the experimental forms may well with equal validity be formalized in terms
of vowel stress rather than vowel position, thus: $C_1 + \acute{V} + C_n + \sigma$. For arguments
concerning the importance of stressed syllables as salient portions in child
language, see, e.g., Kiterman (1913), Švačkin (1948: 103), Ervin-Tripp (1966:
71), Blasdell and Jensen (1970).
Noticeable consonants
The notation $C_n$ was used above to refer, rather vaguely, to what is called the
'most noticeable' consonant. The implicit assumption is that whenever C made
a choice between medial and final consonants of the input for his canonical UF,
some kind of non-random choice was involved. It could be suggested that C
would, for example, choose the acoustically most noticeable consonant; or that
he would choose the one which was phonologically most marked; or that he
would choose the one he had learned best. Other phonological factors may also
be involved; the following remarks are therefore tentative. If uninformative data
are ignored (words with no final consonant, and hence no choice; words with
similar or identical medial and final consonants; and words with medial liquids,
where substitution may be involved), 21 items remain. Of these, 4 show
vacillation between Equations A and B. Of the remainder, 9 follow Equation
A, and 8 follow Equation B. The facts may be summarized as follows (here, > is
used to mean 'is chosen in preference to'): Equation A: s > n; sk > t; z > nt; k >
t; s > l; m > n; ts > n; z > k; ks > nt. Equation B: t > n (3 times); l > nd; d > b; d >
v; k > d; dz > k. This shows that, in general, consonant clusters are more
noticeable than single consonants; and that the strident fricatives are more
noticeable than other single consonants. The evidence with regard to other
parameters is negligible or contradictory, although nasals appear to be generally
less noticeable than other consonants.
The syllable as a basic unit
The data presented and analysed here give strong support to the position that the
syllable should be regarded as the basic unit in phonological acquisition, if not
also in the phonology of those for whom acquisition is complete. Here the
ambiguous position of the ubiquitous [j] must be discussed (see data section,
above): is this sound, as it occurs in all the BEFs, to be regarded as an offglide to
the first vowel, or as the onset to the second syllable? The evidence appears
contradictory: on the one hand, the glide clearly seems to have developed
(in its use as the automatic medial in the BEFs) from a substitute for liquids
(see the discussion of substitutions, above), and thus to have a clear consonantal
origin. On the other hand, comparison of the MEFs with the BEFs suggests that
the glide is an automatic insertion to break the hiatus between the two vowels
and hence that it was not a component of the UFs at all (see the discussion of
strategies and ruses, above). Whatever its role, however, it is clear that the most
fundamental part of C's strategy was the recognition of the polysyllabic nature of
the input and the manifestation of this fact as a bisyllabic form or, infrequently,
as a monosyllabic form with extraordinary length. We suggest that the first
opposition learnt by the child, i.e., that between C + V and zero (implicit in
Jakobson 1941), is the foundation for a syllabic learning of phonology: when the
child perceives the importance of the contrast between monosyllabic and
bisyllabic forms, he learns to produce the extra syllable in some fashion or other: by
reduplication in some cases, by a fair approximation in others, and/or by resorting
to an unusual strategy, as in C's case. Arguments for this syllabic viewpoint date
back at least to Ołtuszewski (1897: 20–1), and have been renewed recently by,
e.g., Mikeš and Vlahović (1967), Ladefoged (1967: 149), Moskowitz (1970:
439–41; 1972: 51–5, 60–2), Menyuk (1971: 59–71; 1972: 1163), Waterson
(1971: 206–7), Drachman (1973: 146; 1975), Ferguson and Farwell (1975), and
Ingram (1974a, b). It may be pointed out that C dealt with the syllables of the input
forms in a twofold manner. The first (or, the stressed) syllable was highlighted,
and reproduced in a manner quite appropriate to the stage of development then
reached; the other syllable was dealt with quite cursorily and in a very general
way, by recourse either to lengthening of the UF vowel or to insertion of a
combination of the simple glide and (in many cases) the maximally open vowel.
Finally, the few MEFs which were observed, and which sounded (apart from the
extra degree of length) extremely similar to a reading of the postulated UFs, were
manifestations of an alternative ruse and were apparently employed as a kind of
last fling before the strategy was abandoned altogether.
Note
1. We discard the homonyms (3) and (8) because (3) is dubious; (53), (57a), (58a) for
similar reasons, cf. the discussion of idiomatic forms; (5) and (6), (23) and (24),
because one member of the pair fits BI; (22a), because it also fits BI; and (30a) and
(34) because the vocalic correspondence in (30a) is not regular.
References
Berko, J. (1958). The child's learning of English morphology. Word, 14, 150–77.
Blasdell, R. and Jensen, P. (1970). Stress and word-position as determinants of imitation
in first-language learners. Journal of Speech and Hearing Research, 13, 193–202.
Drachman, G. (1973). Some strategies in the acquisition of phonology. In
M. J. Kenstowicz and C. W. Kisseberth (eds.), Issues in phonological theory, pp.
145–59. The Hague: Mouton.
(1975). Generative phonology and child language acquisition. In W. U. Dressler and
F. V. Mareš (eds.), Phonologica 1972. Akten der zweiten Internationalen
Phonologie-Tagung. Munich: Fink.
Edwards, M. L. (1971). One child's acquisition of English liquids. Papers and Reports
on Child Language Development, 3, 101–9.
Edwards, M. L. and Garnica, O. (1973). Phonological variation in imitated and
spontaneous utterances. Papers and Reports on Child Language Development, 5, 78.
Ervin-Tripp, S. M. (1966). Language development. In L. W. Hoffmann and
M. L. Hoffmann (eds.), Review of child development research, pp. 55–105.
New York: Russell Sage Foundation.
Ferguson, C. A. and Farwell, C. B. (1975). Words and sounds in early language
acquisition. Language, 51, 419–39. Reprinted in this volume as Chapter 4.
Ferguson, C. A., Peizer, D. B., and Weeks, T. E. (1973). Model-and-replica grammar of a
child's first words. Lingua, 31, 35–65.
Grammont, M. (1902). Observations sur le langage des enfants. In Mélanges
linguistiques offerts à A. Meillet. Paris: Klincksieck.
Hsieh, H-I. (1972). Lexical diffusion: evidence from child language acquisition. Glossa,
6, 89–104.
Ingram, D. (1970). Some suggestions on the role of systematic phonemics in child
phonology. Papers and Reports on Child Language Development, 1, 43–55.
(1974a). Phonological rules in young children. Journal of Child Language, 1, 49–64.
(1974b). Fronting in child phonology. Journal of Child Language, 1, 233–41.
Jakobson, R. (1941). Kindersprache, Aphasie und allgemeine Lautgesetze. Uppsala:
Almqvist & Wiksell.
Kiterman, B. (1913). Opyt izučenija slogovoj ėlizii v detskom jazyke. Russkij
filologičeskij vestnik, 69.
Kresheck, J., Fisher, H., and Rutherford, D. (1972). A study of r-phones in the speech of
three-year-old children. Folia Phoniatrica, 24, 301–12.
Ladefoged, P. (1967). Three areas of experimental phonetics. London: Oxford
University Press.
Ljamina, G. M. (1958). K voprosu o mexanizme ovladenija proiznošeniem slov u detej
vtorogo i tret'ego goda žizni. Voprosy psixologii, 6, 119–30.
McNeil, J. and Stone, J. (1965). Note on teaching children to hear separate sounds in
spoken words. Journal of Educational Psychology, 56, 13–15.
Menyuk, P. (1971). The acquisition and development of language. Englewood Cliffs, NJ:
Prentice-Hall.
(1972). Clusters as single underlying consonants; evidence from children's
production. In Proceedings of the Seventh International Congress of Phonetic Sciences,
pp. 1161–5. The Hague: Mouton.
Mikeš, M. and Vlahović, P. (1967). Glasovna stuktura KVKV u razvojnom procesu
asociativnog sistema glasova. Prilozi proučavanju jezika, 3, 189–203.
Moskowitz, B. A. (1970). The two-year-old stage in the acquisition of English
phonology. Language, 46, 426–41.
(1972). The acquisition of phonology and syntax: a preliminary study. In K. Hintikka
et al. (eds.), Approaches to natural languages, pp. 48–84. Dordrecht: Reidel.
Ohnesorg, K. (1959). Druhá fonetická studie o dětské řeči. Bratislava: Brno University.
Ołtuszewski, W. (1897). Die geistige und sprachliche Entwicklung des Kindes. Berlin:
Fischer.
Pačesová, J. (1968). The development of the vocabulary of the child. Bratislava: Brno
University.
Priestly, T. M. S. (1980). On homonymy in child phonology. Journal of Child Language,
7, 413–27.
Snow, K. (1963). A detailed analysis of articulation analyses of normal first grade
children. Journal of Speech and Hearing Research, 6, 277–90.
Suxanova, N. V. (1968). Otnošenie razvitija vtoroj signal'noj sistemy v spektral'noj
kartine detskoj reči. Žurnal vysšej nervnoj dejatel'nosti imeni Pavlova, 18, 901–5.
Švačkin, N. X. (1948). Razvitie fonematičeskogo vosprijatija reči v rannem vozraste.
Trudy instituta psixologii, 13, 101–32.
Appendix I Bisyllabic ordinary forms noted in week 4
A. By medial consonant
Medial [j]: ljn, dj, wjl, sjel, j, fj, wj, gj, hj, fjl, sjl, wjl,
njl, jn, bj, nj, tj
lion, deer, whale, seal, ear, fire, wire, gear, here/hear, file, stile, wheel, nail,
iron, beer, near, tire
Medial stop: ddi, ggi, fki, ktn, kkn, ppi, hpow, ddi, kk, bjkan,
wd, pl, pti, bt, tdi, bdi, mgi, fwt, dg, pki, pp, fp, rdi,
ttut, pk, bt, ljdi
Daddy, doggy, Suki, kitten/kitchen, kitchen/chicken, puppy, hippo, Deedee,
cracker, bacon, water, apple, potty, butter, Teddy, Buddy, monkey, sweater,
digger, picky, popper, supper, ready, toot-toot, paca, better, lady
Medial nasal: mmi, nn, ni, nni, hm, mni, hni, fni, wni, mn
Mummy, Nana, Ernie, Noddy, hammer, money, honey, funny, Wendy, mixer
Medial fricative: krs, psi, k, lziz, szi, szz, rs
Christopher, Priestly, coffee, lizzies, Susie, scissors, rooster
Medial cluster: njn, mgow, bmbow, bmp
Indian, Margot, Jumbo, Grampa
Medial [w]: fw
flower/shower
(Totals: Stop 28, [j] 17, Nasal 10, Fricative 7, Cluster 4, [w] 2)
B. By closure of second syllable
Words with open second syllable (CVCV, VCV, CCVCV, CVCCV). Total: 49
Words with closed second syllable (CVCVC, VCVC, CVCCVC; n.b. includ-
ing words ending in [w]). Total: 19
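As a quick consistency check on the two tallies in Appendix I, the stated totals can be summed and the share of medial-[j] forms computed. This is a minimal sketch using only the totals given above; the dictionary and function names are hypothetical.

```python
# Totals as stated in Appendix I, part A (by medial consonant).
BOFS_BY_MEDIAL = {"stop": 28, "j": 17, "nasal": 10, "fricative": 7, "cluster": 4, "w": 2}
# Totals as stated in Appendix I, part B (by closure of second syllable).
BOFS_BY_SECOND_SYLLABLE = {"open": 49, "closed": 19}


def share(counts, key):
    """Proportion of all counted forms that fall under `key`."""
    return counts[key] / sum(counts.values())


if __name__ == "__main__":
    print(sum(BOFS_BY_MEDIAL.values()), sum(BOFS_BY_SECOND_SYLLABLE.values()))
    print(share(BOFS_BY_MEDIAL, "j"))
```

Both tallies sum to 68 forms, and the medial-[j] forms make up 17 of 68, i.e., one quarter of the BOFs noted in week 4.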
Appendix II Data
(Numbers following each output form give the week or weeks in which it was noted.)
No.    Input       BEF                MEF           ORF           Lexical item
(1)    pjnt        pijat 4, 8         -             pjnat 9       peanut
(2)    plow        a pijal 7          -             plow 13       pillow
                   b pijow 8
(3)    pjst        pija 4             -             pjs 12        (tooth)paste
(4)    pnd         pajan 4            -             pnd 9         panda
(5)    prt         pajat 5, 9, 11     -             prt 13        parrot
(6)    pwd         pajat 4            -             pwd 11        powder
(7)    prid        {a pajas 5}        -             prs 12        porridge
                   b pajs 12
(8)    pljsk       pija 4             -             pljsk 13      police-car
(9)    brijz       bjas 6             -             brijs 13      berries
(10)   bjsn        bajas 4            -             bjsn 12       bison
(11)   bskt        bajak 7            -             bsak 12       basket
(12)   bnn         bajan 3, 4, 6      -             bnan 5        banana
(13)   tjg         a tajak 4          -             tjg 10        tiger
                   b taja 4; 6
                   c kaja 9
(14)   twbn        a bajam 5; 7       b bm 7        twbn 12       Tobin
(15)   tkij        tajak 4            -             tkij 12       turkey
(16)   tmrow       pajm               -             -             tomorrow
(17)   dwnt        djawt 3, 7         -             dnt 8         doughnut
(18)   tkt, kak    a kajak 3; 6; 9    c kk 7        kakat 6       chocolate
                   b kjak 7                         kakt 9
(19)   djzs        dijas 4            -             djzs 13       Jesus
(20)   dndbrdmn    bijan 7            -             dnmn 12       gingerbread-man
(21)   dnf         djan 7             -             dn 12         Jennifer
(22)   krt         a kajat 5; 9       b kt 12       krt 12        carrot
(23)   krl         kajal 3            -             krl 12        carol
(24)   kndl        kajal 3            -             kndl 12       candle
(25)   kbd         kajat 5, 9         -             pbd 12        cupboard
(26)   kvd         kajat 5            -             kvd 13        covered (wagon)
(27)   kwst        kajows 5           -             kws 8         coaster
(28)   gr          gajas 4            -             gras 10       garage
(29)   gbdbg       bajak 7            -             bgd 12        garbage-bag
(30)   fg          a fajak 5          -             fg 10         finger
                   b fijak 4; 7
(31)   fwntn       fajan 6            -             fwtn 12       fountain
(32)   fm          fajam 3, 6         -             fm 6          farmer
(33)   svn         sjan 7             -             svn 7         seven
(34)   sk          fajak 3, 6         -             fk 4          sucker
                                                    sk 12
(35)   swld        sjat               -             -             soldier
(36)   wld         sjat               -             -             shoulder
(37)   prznt       pjas 4             -             przt 13       present
(38)   blkt        bajak 3, 6         -             bkt 11        blanket
(39)   brnd        bjan 4, 7          -             bnd 10        Brenda
(40)   drgn        a dajan 3          -             dgn 10        dragon
                   b dajak 4
(41)   krsms       kijas 4            -             kss 12        Christmas
(42)   flnl        a fajan 6          -             fnl 10        flannel
                   b fajal 6
(43)   spjd        bajat 4            -             bjd 8         spider
(44)   strjm       mijat 4            -             djm 12        streamer
(45)   skwrl       gijal 12           -             grjl 10, 13   squirrel
(46)   wsk         wijak 4            -             ws 12         whisker
(47)   wsl         a wijas 4          -             wsu 12        whistle
                   b wijus 6
(48)   wmn         wajum 6            -             wbn 12        woman
(49)   hdjk        hajak              -             -             headache
(50)   h           haja 4             -             h 12          hanger
(51)   hjdrnt      hajat 7            -             hjdt 12       hydrant
(52)   mnt         mijat              -             -             minute
(53)   mtsn        mjas 3             -             mtsn 12       medicine
(54)   mln         a mjan 7           b men 12      mln 12        melon
(55)   mwgn        maj 6              -             mwtg 8        mouth-organ
(56)   mskks       majks 7            -             mksks 10      musk-ox
(57)   mnst        {a mjan 5}         -             mst 6         monster
                   b majs 4
                   c majs 6
(58)   mjuwzk      a mejas 3          -             mzk 12        music
                   b mijus 6
(59)   lzd         zijan 6            -             lzt 12        lizard
(60)   rkdz        rjas 3             -             rkas 10       records
(61)   rbt         a rajat 4          c rp 10; 11   rbt 9         rabbit
                   b rajap 5; 7
(62)   rajnsrs     {a rajas 4}        -             rjns 12       rhinoceros
                   b rajs 7
(63)   rn          rajan 7            -             rn 10         runner
(64)   jmjuw       a ijumum 6         -             jmju 12       emu
                   b ijum 6
(65)   ndn         jan                -             -             engine
(66)   lfnt        a jat 4; 6         b t 10; 12    lt 8          elephant
(67)   rjl         jal 5, 12          -             rijal 13      aerial
(68)   ksdnt       ajak 5             -             kas 11        accident
(69)   sk          jas 5              -             s 8           Oscar
(70)   rnd         a ajat 6           -             rn 11         orange
                   b ajt 7                          rdz 12
8 Phonological reorganization: a case study
Marilyn M. Vihman and Shelley L. Velleman
Introduction
In 1975 Ferguson and Farwell analyzed the initial consonant use in the first fifty
words of three English-speaking children in an effort to identify the primary
characteristics of children's early sound production in words. They began with
the assumption that both initial consonants and words are valid units of early
child phonology, but ended by concluding that much of early phonology is
word-based. Since that time, that conclusion has come to be widely accepted
(see e.g., Grunwell 1981; Menyuk, Menn, and Silber 1986; Studdert-Kennedy
1987; MacKain 1988).
If we see phonological development as beginning with pre-systematic,
whole-word-based productions, we must then ask how the child proceeds from that
point to the orderly, segment-substitution-based phonology described for older
children, the hallmark of which is said to be its systematicity (Oller 1975; see
also Smith 1973; Ingram 1986). The first five to ten words may show little
phonological interrelationship. Jusczyk (1986) hypothesizes that the earliest
word recognition network may be restricted to words stored as separate entities
with no particular organization (p. 14). In production, similarly, there is
little evidence of even incipient phonological organization in the form of the
first few words (see Waterson 1978 and also Vihman 1987, which lists the first
five to six words of each of twenty children learning six languages). As more
words are added to the lexicon, however, we begin to see a decrease in the number
of individual phones used in limited lexical contexts as fewer, broader phone
categories are used in larger classes of lexical contexts (Leonard, Newhoff, and
Mesalam 1980; Stoel-Gammon and Cooper 1984). We also see the emergence of the
first idiosyncratic rules or word recipes which restrict the child's first sound
configurations (Ingram 1974; Menn 1979), and the first signs of phonological
behavior which imply some awareness of segment (Bleile 1986).
The relationship between these earliest detectable signs of systematization
and later segmental phonological systems is not yet well understood. We do not
yet know whether children acquire segmental phonology suddenly and/or across the
board (i.e., all phonemes in all word positions), or whether segmentation
gradually becomes the predominant characteristic of their phonological systems.
The nature and process of emerging phonological systems must be studied in
greater detail if we are to understand how the child gets from here to there,
from whole words to segments. (Macken 1979 provided a rare instance of a
longitudinal study documenting this transition.) Furthermore, there has been
little acoustic verification of the systematization of children's phonologies,
despite the well-known limitations on auditory transcription, which are due
primarily to listeners' tendencies to hear through the filter of the established
categories of the adult language (see e.g., Stockman, Woods, and Tishman 1971;
Zlatin and Koenigsknecht 1975, 1976; Macken and Barton 1980; Maxwell 1981;
Maxwell and Weismer 1982).
This study was supported in part by funding from the National Science Foundation
(BNS 7924167 and 8520048).
The purpose of the present study is to illustrate the process of reorganization
and the beginning of phonological systematization, as validated acoustically,
in the development of one particularly voluble child recorded weekly from
the age of 9 to 16 months. This child's lexical production was analyzed in
detail from the onset of word use, at 10 months, to 16 months, when she had a
cumulative lexicon, according to maternal report, of over 70 words. Both
perceptual (transcription-based) and acoustic analyses were carried out, in
order to identify and confirm the emergence of a phonological system, including
evidence of the onset of productive word-configuration patterns and of the
earliest behavior indicative of the treatment of segments as entities
distinguishable within words.
We will document three related characteristics of the process of
systematization in this child's production:
1. Experimentation, or the phonological variation resulting from the child's
apparent exploration of alternate solutions to production problems posed by
particular target words, given the child's articulatory constraints (Macken
and Ferguson 1981; Bleile 1986).
2. Word recipes, or the use of idiosyncratic, whole-word-sized production
patterns, sometimes involving a prosodic match to the target. The
restructuring of adult targets to fit child output patterns provides the best
evidence of the workings of a word recipe or articulatory routine. These
recipes allow the child to expand his or her lexicon within the constraints
of a small number of possible output shapes. (See Waterson 1971; Menn
1976; Vihman 1976, 1981; Macken 1978, 1979 for examples of this
phenomenon.)
3. Regression, the nonlinear progression more familiar from studies of the
acquisition of morphology and syntax (e.g., Bowerman 1982), in which
early forms accurately reflecting an adult model are replaced by less
advanced forms in closer conformity with the child's system (Leopold
1947).
Finally, we will consider the child's emergent system from the point of view of
the word vs. the segment as the primary unit of organization.
Method
Subject
The subject of this study, Molly, was one of ten children whose language
development was followed as part of the Stanford Child Phonology Project.
Each child was audio- and video-recorded weekly at home in free play with
the mother for 30 minutes, from 9 to 16 months of age. A maternal interview
focusing on the child's progress in language comprehension, gestural and
vocal communication, and play was administered monthly (Bates, Benigni,
Bretherton, Camaioni, and Volterra 1979) and the mother was asked to maintain
a daily log in which she recorded advances relevant to the child's
communicative and symbolic development. The mother was also asked to so structure the
play session as to allow the child to produce any words that seemed to be in use
in the preceding week, so that the child's current lexicon would be represented
in each recording to the greatest possible extent. The number of cumulative
words reported by each mother in the larger study for the lexicon of her child
was found to be approximately twice the number produced during a given
session (Vihman and Miller 1988). Further details of subject selection and
recording procedures from the larger study are available elsewhere (Vihman,
Macken, Miller, Simmons, and Miller 1985; Vihman, Ferguson, and Elbert
1986; Vihman and Greenlee 1987; and Vihman and Miller 1988).
Molly's word production and phonology were unusual in several respects.
First, she was an exceptionally voluble child. Molly ranked first in mean
vocalizations per session, sampled over seven sessions from 9 to 16 months
(Vihman et al. 1985). Secondly, Molly's production patterns for individual
words were remarkably consistent or stable within a given session; there was
relatively little variability from token to token of a certain word type. At
16 months Molly used the smallest number of different phonetic shapes per
word (fewer than two different phonetic shapes per word type: Vihman and
Greenlee 1987). Thus, many tokens were available for each word type, and
those tokens tended to be phonetically similar.
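The consistency measure referred to here, the number of different phonetic shapes per word type, can be illustrated with a minimal sketch. It assumes that each token is recorded as a (word type, transcription) pair. The function names and the ASCII stand-in transcriptions are hypothetical; they are not taken from the study's data or from the procedure of Vihman and Greenlee (1987).

```python
from collections import defaultdict


def shapes_per_type(tokens):
    """Count the distinct phonetic shapes recorded for each word type.

    `tokens` is an iterable of (word_type, transcription) pairs, one per token.
    """
    shapes = defaultdict(set)
    for word_type, transcription in tokens:
        shapes[word_type].add(transcription)
    return {word_type: len(forms) for word_type, forms in shapes.items()}


def mean_shapes_per_type(tokens):
    """Mean number of distinct shapes per word type (lower = more consistent)."""
    counts = shapes_per_type(tokens)
    if not counts:
        raise ValueError("no tokens supplied")
    return sum(counts.values()) / len(counts)


# Hypothetical session: ASCII stand-ins, not transcriptions from the study.
session = [
    ("hot", "hat"), ("hot", "hat"), ("hot", "ha"),
    ("up", "ap"), ("up", "ap"),
]
print(shapes_per_type(session))
print(mean_shapes_per_type(session))
```

A value below two on this measure would correspond to the kind of token-to-token stability described for Molly, provided that transcriptions are compared at the same level of phonetic detail.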
Furthermore, Molly's early phonology was relatively systematic as well
as stable. She attempted only a restricted set of consonants (primarily stops
and nasals: Vihman et al. 1986). Last but not least, Molly was unusual in her
early preference for final consonant production. Twenty-three percent of her
vocalizations (both words and babble) included final consonants at a time
when she had a 30- to 50-word lexicon, as compared with a range of 4 to 19
percent final consonants for four other children (Vihman and Greenlee 1987;
for a discussion of the rooting of this final consonant preference in the
prelinguistic period, see McCune and Vihman 1987). Of 77 word types
recorded for Molly over seven months, 35 percent sometimes included a
final consonant, compared to a mean, for seven subjects, of 25 percent
(Vihman and Hochberg 1986). All of these aspects of Molly's phonological
style will be relevant to our discussion of her treatment of final consonants
over time.
Data preparation
Each audio tape was transcribed by one of four transcribers using a fine phonetic
transcription system based upon the IPA and supplemented by a symbology
especially developed for use with children (Bush, Edwards, Luckau, Stoel,
Macken, and Peterson 1973). Reliability was tested using brief samples from
tapes of seven infants, including Molly. Agreement with respect to place and
manner in consonants, syllable shape, and vocalization length in syllables
reached 86 percent. If differences involving initial and final glottal stops and
[h] are disregarded, reliability across the four transcribers reaches 91 percent for
the parameters tested.
The video tapes were reviewed repeatedly, once auditory transcription had
been completed, in order to determine the word status of each vocalization. Both
form and meaning were taken into consideration in evaluating word status
(see Vihman et al. 1985; Vihman et al. 1986; Vihman and Greenlee 1987; and
Vihman and Miller 1988 for discussion of the problems involved in word
identification at this age). Word tokens tentatively identified early on were
sometimes discarded from the final analysis. This generally occurred when
one or more of the following sources of doubt about word status obtained: the
context was insufficiently clear, the occurring phonetic shape was relatively
distant from the suspected adult target or was not easily distinguishable from
other vocalizations used in different contexts, or the word type in question was
used only once or in only one episode on the tape. In Molly's case, frequent,
relatively stable repetition of tokens of the same word type and, in later sessions,
increasing use of final consonants, made word identification relatively less
problematic than for some of the other infants.
Data analysis
Approximately two sessions per month were selected for acoustic analysis,
beginning with the session in which Molly was 1;0.26 (one year and twenty-six
days old) and produced 11 different words spontaneously (11 word types) in the
course of the half-hour recording, and ending with the session in which she was
1;2.20 and produced 19 words spontaneously. Perceptual (transcription-based)
analysis covered a larger number of sessions, from 0;10.15 to 1;3.24. Molly's
mother had reported a cumulative vocabulary of 69 words by the week preceding
this last session analyzed. Molly's age at each session and the number of
spontaneous adult-based word types identified during that session are given in
Table 8.1.
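The age notation used for the sessions (years;months.days, so that 1;0.26 is one year, no months, and twenty-six days) can be handled mechanically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the 30-day month used for the approximate conversion is an assumption, and the pattern tolerates a stray space or a colon in place of the semicolon only as a convenience.

```python
import re

# years ; months . days, e.g. "1;0.26"
AGE_PATTERN = re.compile(r"^\s*(\d+)\s*[;:]\s*(\d+)\s*\.\s*(\d+)\s*$")


def parse_age(text):
    """Return (years, months, days) for an age written as years;months.days."""
    match = AGE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a years;months.days age: {text!r}")
    years, months, days = (int(part) for part in match.groups())
    return years, months, days


def age_in_months(text, days_per_month=30.0):
    """Approximate age in months (assumes a nominal 30-day month)."""
    years, months, days = parse_age(text)
    return years * 12 + months + days / days_per_month


# Sorting on the parsed tuple orders sessions chronologically, which a plain
# string sort does not guarantee (compare "0;11.9" with "0;11.20").
sessions = ["1;0.26", "0;11.20", "0;10.15", "0;11.9"]
print(sorted(sessions, key=parse_age))
```

Sorting on the parsed tuple rather than on the raw string avoids misordering sessions whose day counts differ in number of digits.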
Perceptual analysis of Molly's speech revealed that words of CVC shape
were of particular interest with respect to Molly's phonological reorganization.
Spectrograms were therefore made of all word tokens of CVC shape which were
not interrupted by environmental noise or overlapping speech. An oscillogram
was also made of each utterance for verification of patterns detected within the
spectrogram.
Results
Perceptual analysis
Words with final nasals. Longitudinal perceptual analysis of Molly's
speech reveals the development of a nasal-final word recipe or articulatory
routine. This routine developed through the stages described below (see
Table 8.2 and Appendix):
a. Presystematic; no pattern. In the first two months of word production no
nasal-final words are attempted, disregarding the onomatopoeic yum-yum
and vroom-vroom, which are highly variable in the adult presentation (yum
ranges from [jm] to [], and vroom ranges from [vrm] to []). In fact,
the only nasal words recorded are mama, moo, and night-night (produced by
the mother as nigh-nigh).
At 1;0.26 Molly attempts her first conventional nasal-final words, actually
producing a final nasal in at least one token of bang [ba
]. (Table 8.2
therefore begins at this point.) However, each nasal-final word attempted is
phonologically distinct from the other in the child's production; no pattern is
evident. The child appears to have a set of unrelated forms which happen to
end similarly in the adult language.
b. Apparent increase in awareness of phonetic targets; experimentation.
At 1;1.8, Molly imitates down [t
~ t
:], using a final velar nasal alternating
with glottal stop in final position, as in her previous productions of
bang. For the first time, two monosyllabic target nasal-final words (bang
and down) are treated in the same way, suggesting recognition of a common
adult pattern. At the same time, a wide range of different endings are
experimented with for one of these words (bang), only some of which
actually include a final nasal (the free variation between initial [p] and [b]
is disregarded here, both being noted as [b]):
[b:]
[b
]
[bk]
[b:]
[b:in]
[b
:ini]
[b
]
Note that one of these variants, [b:], is a fairly exact rendering of the adult
form. As described below, this experimentation ends in the gradual dominance
of one particular output pattern, but not, as it happens, the most
accurate variant.
Table 8.1. Word production by age
Age (years, months, days)   Spontaneous adult-based word types
0;10.15                     4
0;11.9                      2
0;11.20                     3
1;0.10                      5
1;0.26                      11
1;1.8                       15
1;1.15                      13
1;2.20                      19
1;3.24                      35
Table 8.2. Nasal-final production pattern
Produced with nasal Produced w/out nasal
Stage Age Monosyllabic target Disyllabic target
[final nasal (+/- stop)] [initial nasal medial nasal final nasal]
a 1;0.26 bang (1 token) [ba
I
] balloon [by
h
]
button [b]
bang [ba]
b 1;1.8 bang [b] bang [p:]
down [t
:] down [t
]
c 1;1.15 down [d:n] button [pa
n
n]
round [han]
d 1;2.20 around [wa:n] Ernie [hn] Brian [pan]
down [tn] Granma [nm] button [pa
n
n]
hand [han] Jennie [tni] building [pa:n]
e 1;3.24 bang [pa
n
n] Nicky [nni] camera [ka
m
m] Graham [k
n
ni] open [hp]
down [ta
n
n] Nonny [na
n
ni] piano [pa]
green [kyni]
in [
n
ni]
name [ne
m
mi]
NB: One token is cited here for each word; see Appendix for other tokens.
c. Production pattern dominance. At 1;1.15, one of the patterns used for
bang emerges as the preferred pattern for target nasal-final words; this
pattern dominates within word variants and is used across nasal-final word
types as well, extending to one disyllabic model. This preferred pattern is
(C)VN:V, in which the medial consonant varies between [n] and [], and the
final vowel between (voiced or voiceless) [i] and [] (see the last form cited
above). This pattern is the most common production type for all three
nasal-final words attempted: two monosyllables and one disyllable formerly
produced without a final nasal.
d. Restructuring of targets. One month later (1;2.20) we see the same pattern
in use, but the competing alternate forms no longer occur. Furthermore, other
word types are restructured in accordance with the newly established word
recipe. In addition to nasal-final target words, target disyllabic words with
medial nasals and eventually (at 1;3.24) even one word with a target initial
nasal are adapted to the child's preferred nasal pattern. Nine words are
involved at 1;2.20: three monosyllables, three disyllables with target final
nasals, and three disyllables with target medial nasals (see note 1).
The change in Molly's production of words involving nasals was
striking enough to provoke a spontaneous comment by her mother at this
time. She reported that button, balloon, banana, and bunny, each of which
had had its own unique phonetic shape previously, were now all produced
as [bn:] or [ban:], a regression reflecting the force of Molly's new
output pattern.
e. Other patterns emerge. At nearly 16 months (1;3.24) Molly continues to
make use of the nasal production pattern, adding camera (/kmr/, produced
as [k
m
m]) and even reforming the nasal-initial Nicky: [
n
ni] to fit the
pattern. On the other hand, some nasal words are now produced without a
nasal: open ([hp]) and piano ([pa]).
Words with final obstruents. An obstruent-final output pattern developed
in a manner which paralleled the emergence of the nasal output pattern
quite closely. The chronology of this related pattern is described below and
displayed in Table 8.3. Comparing the chronology of changes in final obstruent
use with the stages of development of the nasal-final production pattern, we can
characterize Molly's progress as follows:
a. Pre-systematic. Molly first produces a final stop in a word token at 0;10.15,
when cracker occurs in a range of variants, including the monosyllables
[kk], [w
h
k], and [kk], as well as disyllabic [pak], [tk], and
[pkwa]. In the following month block, cat, and dog are all produced as
monosyllables with no supraglottal final consonant ([pa], [ka:], [d]). At
1;0.10 four words are produced with final stops: baby [pe:p], good girl
[kVkVk], peek(-a-boo) [pk], and (w)oops [Vp] (V = a range of different
vowels). Thus, during this period consonant-final words are attempted, and
monosyllabic variants with final consonants occur in attempts to produce
disyllabic words, but no pattern is evident. (Compare 0;10.15 through 1;0.26
for nasals.)
b1. Apparent increase in awareness of phonetic targets. Two weeks later, at
1;0.26 (the session in which the first conventional nasal-final words are
attempted), Molly produces ten new words with final obstruents (see
Table 8.3). Furthermore, a new level of attention to word endings seems
to be reflected in the fact that virtually all of these words occur with heavily
aspirated final consonants: e.g., cup [kk
h
], hot [hat
h
], teeth [tit
h
], and up
[hp
h
]. In addition, these consonant-final words are characterized by relatively
little variation. In sum, there is an increase in the targeting of words of
a given type, with phonetic output patterns at 1;0.26 for obstruents similar
to those which occurred for nasals at 1;1.8, the next session analyzed.
b2. Experimentation. At 1;1.8 (the session in which bang is explored through
a range of different variants), Molly again produces a large number of
consonant-final words, but the degree of variability rises sharply. Two
words in particular, hot and teeth, resemble bang in the wide range of
variants used:
hot [hat] (10) ~ [ht] ~ [at
h
] ~ [ha] (2) ~ [ha] (2)
teeth [ti
t
t
h
i] (2) ~ [titsti] ~ [tit
h
t
h
t
h
] ~ [tit] (5) ~ [tit] (2)
(Note: initial [t] = [t] or [d])
In addition to the glottal-stop-final variants of hot, Molly produces boat
[be], cat [k
h
], and dog [ta
h
] as open monosyllables in this session. On the
other hand, the extra-heavy aspiration of teeth occurs also in an imitation of
clock: [ka
k
k
h
]. This increase in variability, especially in the production of
a few often-used words, parallels the variable production of bang, also at
1;1.8. A dialogue which occurs between Molly and her mother during the
session at 1;1.8 suggests that the child is aware in some sense that released
final stops enhance intelligibility:
Table 8.3. Obstruent-final production pattern
Final consonant used No final consonant used
Stage Age Target final C No obstruent in _# in model Target final C
Stop Fricative Stop Fricative
labial dental velar
0;10.15 cracker [kk]
0;11.9 cat [k
h
]
0;11.20 block [pa]
dog [d]
a 1;0.10 oops [p] peek [pk] baby [pe:p]
good girl [kk
h
k]
b1 1;0.26 cup [kk
h
] bird [pt] book [pk
h
] teeth [tit
h
] pumpkin [kk] close [ko
]
up [hp
h
] hot [hat
h
] box [bk
h
]
toot [tt]
cold [kok
h
]
b2 1;1.8 hot [hat] clock [ka
k
ki ] teeth [tit t
h
i] good girl [kkkkk] boat [be]
peek [pe||k
x
e] horse [ht] cat [k
h
]
squeak [k
h
k
h
] dog [ta
h
]
tick [tt
h
]
c 1;1.15 up [p] hot [ht] good girl [ggk] hat [h] box [ba]
burp [pap
h
]
d 1;2.20 up [p] foot [pt] book [pk] bus [pt] baby [pib] nose [n:]
eat [t
] stuck [tt
h
] cheese [at
] Hooper [pt]
coat [kk] block [pt] watch [wat] apple [ap]
toot [tut] tock [t
h
aki]
clock [kak]
1;3.24 Brett [pat] block [pak] glasses [ka
k
k
h
i:] nose [no:]
red [wa::t] book [pk] house [hat] beads [pi]
that [tat] click [kik] Ruth [ht]
oink [ho:|k
h
]
peek [pik]
pig [pk]
rug [wak]
. stuck [kak]
walk [wak
h
]
work [hk]
NB: One token is cited here for each word; see Appendix for other tokens.
Mother: Good stuff. Good stuff to eat.
Molly: [ha]. [ha]. (hot X 2)
Mother: To eat.
Molly: [ha
h
]. (hot)
Mother: No, I didn't say teeth; I said eat. (Laughs)
Molly: [ht]! (hot)
Molly repaired her misunderstood productions of hot by over-articulating the
final consonant, which was more in keeping with her previous (and later)
pronunciations of this word.
c. Production pattern dominance. At 1;1.15, Molly's variability decreases
somewhat, but less than it does for nasals in the same session (1;1.15). The
production of heavily aspirated final consonants (CVCʰ) begins to emerge as
the preferred output pattern.
d. Restructuring of targets. By 1;2.20 production of final consonants has
again stabilized. Aspirated release characterizes most of Molly's final
consonants at this point. In addition, certain targets are restructured to fit this
output pattern by 1;3.24. Substitution of a dental stop or affricate for a final
voiceless dental fricative has become a systematic phonological process for
Molly (bus [pt], house [hat], Ruth [ht], in addition to the earlier teeth
and horse [ht]). Voiced fricatives are omitted word-finally (nose [no:],
beads [pi]). But cheese is exceptional. Like the reformation of Nicky noted
earlier, the cheese variants [at ] and [it ] appear to reflect whole-word
processing. The word-initial [t ], which Molly had not yet produced
elsewhere, is moved to final position, where [t ] has already occurred in some
variants of words like hot and teeth.
In summary, for obstruents, just as for nasals, a gestalt-like whole-word pattern
is now imposed on a certain number of adult words with final obstruents. The
change in final-obstruent patterning is less dramatic than the development of
the final-nasal production pattern, but is nonetheless unmistakable. Before the
reorganization, final-consonant production has an unplanned, accidental look to
it: there is little relationship between the adult shape of the different words
produced with final consonants (e.g., baby, good girl, oops). After the stages of
pattern-discovery and experimentation, a stable production pattern emerges and
is widely applied, extending also to new types of adult models.
The development of these two production patterns is interesting not only for
the chronological parallels (summarized in Table 8.4), but also for the fact that
both output patterns involve a strategy for producing final consonants. In fact,
both patterns involve continued airflow following final consonant constriction,
and both serve to increase the salience of final consonants. Extension of the
obstruent-final production pattern to cheese is fully parallel to the extension of
the nasal-final pattern to Nicky. There is thus clear phonological evidence of a
reorganization affecting consonant-final words during this period. The
similarities in the shape and function of the two patterns lead us to conclude that a
unitary phonological (re)organization has occurred.
Acoustic analysis
Acoustic analysis was used to verify the perceptions of the transcribers and the
authors. Spectrograms were made of all productions relevant to the developments
described here which were relatively free of background noise, competing
maternal speech, etc. (approximately 50 to 75 percent of productions in each
category in each session; see Appendix). Wide-band spectrograms were made
on the Kay Elemetrics Corp. Digital Sona-Graph 7800, using a 500 Hz bandwidth
filter, and representing the frequencies from 0 to 5000 Hz. Spectrograms
of stop-final words were measured for duration of consonant closure and for
duration of release (aspiration) following closure. Spectrograms of nasal-final
words were measured for duration of vowel + nasal and for duration of nasal
release following the final nasal. Because nasal-final words were often produced
with a nasalized vowel only (e.g., [b
:]) in the first sessions analyzed, the
durations of the vowel and the nasal could not be measured separately. In
addition, many of these early productions had no nasal release, yielding zero
values for this measure.
Figures 8.1 through 8.4 are boxplots of values obtained in sessions at 1;0.26
through 1;2.20. The central outline boxes depict the middle half of the data
(25th to 75th percentiles). Medians are indicated by the central line across each
box. The whiskers that extend from the top and bottom of the box represent
the extent of the main body of the data. Extreme data values are indicated with a
circle or, if very extreme, with an asterisk. (See Velleman and Hoaglin 1981 for
further information about the interpretation of boxplots.)
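The quantities summarized in such a boxplot can be computed directly from a list of duration measurements. The sketch below is illustrative only: it assumes the common convention of fences at 1.5 and 3 times the interquartile range for "extreme" and "very extreme" values, which the text does not specify, and it uses interpolated quartiles from the Python standard library rather than the hinges described by Velleman and Hoaglin (1981). The function name and the sample durations are invented.

```python
from statistics import quantiles


def boxplot_summary(values, inner=1.5, outer=3.0):
    """Summarize durations the way a boxplot displays them.

    `inner` and `outer` are fence multipliers of the interquartile range;
    the defaults are an assumed convention, not values given in the chapter.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_inner, high_inner = q1 - inner * iqr, q3 + inner * iqr
    low_outer, high_outer = q1 - outer * iqr, q3 + outer * iqr
    # Whiskers reach the most extreme observations inside the inner fences.
    inside = [v for v in data if low_inner <= v <= high_inner]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        # Plotted as circles: beyond the inner fences, within the outer ones.
        "extreme": [v for v in data
                    if low_outer <= v < low_inner or high_inner < v <= high_outer],
        # Plotted as asterisks: beyond the outer fences.
        "very_extreme": [v for v in data if v < low_outer or v > high_outer],
    }


# Invented release durations in msec, for illustration only.
durations = [100, 120, 130, 140, 150, 160, 170, 250, 600]
print(boxplot_summary(durations))
```

A longer box (larger distance between q1 and q3) and longer whiskers for one session than for another indicate greater variability in that session, which is the comparison the figures are used to make.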
Figures 8.1 through 8.4 illustrate that, for all four variables (vowel + nasal
duration, nasal release duration, consonant closure duration, and consonant
release duration), Molly's variability increased substantially (i.e., boxes become
longer, whiskers extend farther) in the sessions at 1;1.8 and 1;1.15. During this
time period Molly experimented with her new production patterns and gradually
applied them throughout the lexicon. Variability then decreased again
(i.e., boxes are more compact, whiskers are very short) by 1;2.20 when these
patterns had become established. Although the number of word types and
tokens for each pattern varies from session to session, the consistency with
which this increase in variability occurs across all four measures confirms the
transcription-based impression of increased variability during experimentation
followed by decreased variability coinciding with establishment of new
production patterns.
Table 8.4. Chronology of phonological reorganization
Stage                                                    Obstruents   Nasals
a. pre-systematic                                        1;0.10       1;0.26
b.1. emergence of pattern                                1;0.26       1;1.8
b.2. experimentation                                     1;1.8        1;1.8
c. production pattern established                        1;1.15 (?)   1;1.15
d. restructuring of targets to fit production patterns   1;2.20       1;2.20
Figure 8.1. Vowel plus nasal duration by age (boxplots for child ages 1;0.26, 1;1.8, 1;1.15, and 1;2.20; vertical axis: vowel + nasal duration in msec, 0 to 2000)
Figure 8.2. Nasal release duration by age (boxplots for child ages 1;0.26, 1;1.8, 1;1.15, and 1;2.20; vertical axis: nasal release duration in msec, 0 to 200)
Figure 8.3. Consonant closure duration by age (boxplots for child ages 1;0.26, 1;1.8, 1;1.15, and 1;2.20; vertical axis: consonant closure duration in msec, 0 to 600)
Figure 8.4. Consonant release duration by age (boxplots for child ages 1;0.26, 1;1.8, 1;1.15, and 1;2.20; vertical axis: consonant release duration in msec, 0 to 1200)
Discussion
The changes in Molly's production of final consonants over time exhibit several
of the characteristics suggested in the literature as evidence of phonological
systematization: experimentation with different means of ending words,
development of a preferred production pattern ((C)VN: and (C)VCʰ), restructuring
of new target words to fit the established production patterns, and apparent
regressions (loss of distinction among nasal-final words, as noted by Molly's
mother; loss of salience in final position for obstruents during the
experimentation period).
Molly's unusual phonetic consistency makes the contrast between
experimentation sessions and preceding and following sessions particularly striking.
Molly's productions at 1;1.8 to 1;1.15 are much more variable than at any other
time, and the patterns of unreleased final stops and of final nasals with nasal
releases which occur during this session are both new and of high frequency.
In addition, the restructuring of targets to fit output patterns following the
experimentation sessions is quite noticeable, especially with respect to final
nasals; the final nasal pattern is extended to a wide variety of words (Tables 8.2
and 8.3). It is clear that some sort of experimentation and reorganization occur
in Molly's phonology at 13 months. The changes which follow this session
cannot be taken to be random or accidental events.
An interpretation reducing the changes described here to motoric maturation
can be ruled out. The new development with respect to final obstruents at 1;1.8
is the production of a variety of unreleased or glottal-stop offsets in words of
the shape CVC. This pattern is abandoned in the next session. Locke (1983)
presents data which suggest that American adults fail to release final voiceless
stops about 36 percent of the time in conversational speech, despite a resultant
loss in intelligibility, and that young children omit or replace final voiceless
stops by glottal stops up to 50 percent of the time (pp. 228ff). These data imply
that Molly's new pattern at 1;1.8 is articulatorily easier than is her previously
and ultimately preferred pattern of released finals. Therefore, the glottal-stop
and vowel-final productions of forms with obstruent-final targets cannot be held
to reflect a phonetic advance over the productions with an aspirated stop release.
In the case of the nasal-final pattern, the phonological reorganization is clear.
A pattern never used before the experimentation session (one involving nasal
releases) is tried out along with a variety of other patterns during that session,
and is clearly dominant in the following sessions (see note 2).
In the case of obstruents, Molly consistently produces aspirated finals both
before and after the experimentation at 1;1.8. The change is that the pattern
becomes more productive following the child's experimentation with other
(phonetically simpler) means of approximating final consonants. The fact that
Molly's CVCs are phonetically the same before and after 1;1.8 does not contradict
the claim that a phonological change occurred; it merely obscures the process.
Both the experimentation at 1;1.8 and the increased productivity thereafter are
evidence of phonological systematization in the case of obstruents as well as
nasals.
Furthermore, it is probably not a coincidence that final obstruents and final
nasals undergo reorganization at the same time. Acoustic analysis suggests that
the two patterns are related. In both cases, final consonants are given a strong
vowel-like release in the form of heavy aspiration for obstruents and of often
voiceless nasal releases for nasals. The articulatory sequence for both involves
continued airflow following the release of the consonant constriction. If voicing
is continued into this post-release airflow, a vowel is likely to be perceived. If
voicing is not continued but the airflow is, a heavily aspirated release is a likely
percept. The fact that vowels were perceived by the transcribers more often
following Molly's (voiced) nasals than following her (voiceless) oral consonants
is not surprising. Furthermore, when vowels were perceived following
final obstruents in Molly's speech, they were often transcribed as voiceless. This
is merely further evidence that her final stops were indeed voiceless.
We embarked upon this study with the question of the nature and process of the
transition from whole-word to segmental phonology. It may be that production
patterns such as Molly's, which highlight segments in one word position only, are
a step in the direction of segmental phonology. Molly seems to experiment only
with variants of final segments; there is no evidence of a change by 16 months
in Molly's organization or awareness of other parts of words. Such changes
may occur so subtly and gradually as to escape notice, or they may occur later in
her phonological development. If the latter is true, then the onset of segmental
phonology may occur gradually in some children, with segmentation emerging
first in the shape of an articulatory routine which affects one particular word
position. A series of such specific articulatory routines would lead the child
eventually to a segment-based phonological system.
Although Molly is an unusual child phonologically in some ways, the process
which we have documented here provides strong confirmation of previous
claims about children's development of articulatory routines (e.g., Menn
1979). Molly's volubility and unusual consistency, coupled with our use of
acoustic as well as perceptual analysis, have permitted more thorough quantitative
documentation than has previously been provided in the literature.
Although such changes in phonetic variability may be more difficult to identify
in other children, the search for such data should be motivated by the results
reported here. Further verification of such changes is important if we wish to
discover the process by which a child progresses from pre-systematic whole-word
based phonology to the orderly, segment-substitution-based phonology
described for older children.
Notes
1. Notice that round and later hand were treated as identical in pattern to the nasal-final
words (name, bang, down, and green). There is no evidence that Molly was distinguishing
perceptually between final nasals and a final nasal + stop cluster (cf. Braine's
[1976] remarks regarding the relative imperceptibility of the voiced stop in such
clusters).
2. It is interesting to note that addition of a nasal offset has been reported for three
children as a phonological strategy facilitating production of a final voiced stop. Fey
and Gandour's (1982) Lasan used a pattern involving voiced stop plus final
nasal, while Damon Clark and Eva Bowerman both used a nasal with or without a
following voiceless stop as a way of expressing final voiced stops (Clark and
Bowerman 1986).
References
Bates, E., Benigni, L., Bretherton, I., Camaioni, L., and Volterra, V. (1979). Cognition
and communication from 9 to 13 months: correlational findings. In E. Bates (ed.),
The emergence of symbols: cognition and communication in infancy, pp. 69–140.
Bleile, K. (1986). Regressions in the phonological development of two children. PhD
dissertation, University of Iowa.
Bowerman, M. (1982). Reorganizational processes in lexical and syntactic development.
In E. Wanner and L. R. Gleitman (eds.), Language acquisition: the state of the art,
Braine, M. D. S. (1976). Review of N. V. Smith, The acquisition of phonology: A case
study. Language, 52, 489–98.
Bush, C. N., Edwards, M. L., Luckau, J., Stoel, C., Macken, M. A., and Peterson, J.
(1973). On specifying a system for transcribing consonants in child language: a
working paper with examples from American English and Mexican Spanish.
Stanford University: Department of Linguistics.
Clark, E. V. and Bowerman, M. (1986). On the acquisition of final voiced stops. In
J. A. Fishman, A. Tabouret-Keller, M. Clyne, Bh. Krishnamurti, and M. Abdulaziz
(eds.), The Fergusonian impact: in honor of Charles A. Ferguson on the occasion of his
65th birthday, vol. 1: From phonology to society, pp. 51–68. Berlin: Mouton de Gruyter.
Fey, M. E. and Gandour, J. (1982). The pig dialogue: phonological systems in transition.
Grunwell, P. (1981). The development of phonology. First Language, 2, 161–91.
49–64.
(1986). Phonological development: Production. In P. Fletcher and M. Garman (eds.),
Language acquisition: studies in first language development, 2nd edn., pp. 223–39.
Jusczyk, P. W. (1986). Toward a model of the development of speech perception. In
J. S. Perkell and D. H. Klatt (eds.), Invariance and variability of speech processes,
pp. 119. Hillsdale, NJ: Lawrence Erlbaum Associates.
Leonard, L. B., Newhoff, M., and Mesalam, L. (1980). Individual differences in early
child phonology. Applied Psycholinguistics, 1, 7–30.
Leopold, W. F. (1947). Speech development of a bilingual child, vol. 2: Sound learning in
the first two years. Evanston, IL: Northwestern University Press.
MacKain, K. S. (1988). Filling the gap between speech and language. In M. D. Smith and
J. L. Locke (eds.), The emergent lexicon: The child's development of a linguistic
vocabulary, pp. 51–74. New York: Academic Press.
(1979). Developmental reorganization of phonology: A hierarchy of basic units of
acquisition. Lingua, 49, 119. Reprinted in this volume as Chapter 5.
(1980). The acquisition of the voicing contrast in English: a study of voice onset time
in word-initial stop consonants. Journal of Child Language, 7, 41–74.
Macken, M. A. and Ferguson, C. A. (1981). Phonological universals of language acquisition.
In H. B. Winitz (ed.), Native and foreign language acquisition, pp. 110–29.
New York: New York Academy of Sciences.
Maxwell, E. M. (1981). A study of misarticulation from a linguistic perspective. PhD
dissertation, Indiana University. (Reprinted by the Indiana University Linguistics
Club, 1982.)
Maxwell, E. M. and Weismer, G. (1982). The contribution of phonological, acoustic and
perceptual techniques to the characterization of a misarticulating child's voice
contrast for stops. Applied Psycholinguistics, 3, 29–43.
McCune, L. and Vihman, M. M. (1987). Vocal motor schemes. Papers and Reports on
Child Language Development, 26, 229.
Menn, L. (1976). Pattern, control and contrast in beginning speech: a case study in the
development of word form and word function. PhD dissertation, University of
Illinois. (Reprinted by the Indiana University Linguistics Club, 1978.)
(1979). Towards a psychology of phonology: Child phonology as a first step. Paper
presented at the Conference on Applications of Linguistic Theory in the Human
Sciences, Michigan State University.
Menyuk, P., Menn, L., and Silber, R. (1986). Early strategies for the perception and
production of words and sounds. In P. Fletcher and M. Garman (eds.), Language
acquisition: studies in first language development, 2nd edn., pp. 198–222.
Oller, D. K. (1975). Simplification as the goal of phonological processes in child speech.
Language and Learning, 24, 299–303.
Smith, N. V. (1973). The acquisition of phonology: a case study. Cambridge: Cambridge
University Press.
Stockman, I. J., Woods, D. R., and Tishman, A. (1971). Listener agreement on phonetic
segments in early infant vocalizations. Journal of Psycholinguistic Research, 10,
593–617.
Stoel-Gammon, C. and Cooper, J. (1984). Patterns of early lexical and phonological
development. Journal of Child Language, 11, 247–71.
Studdert-Kennedy, M. (1987). The phoneme as a perceptuomotor structure. In
A. Allport, D. MacKay, W. Prinz, and E. Scheerer (eds.), Language perception
and production, pp. 67–84. London: Academic Press.
Velleman, P. F. and Hoaglin, D. C. (1981). Applications, basics and computing of
exploratory data analysis. Boston: Duxbury Press.
Vihman, M. M. (1976). From pre-speech to speech: on early phonology. Papers and
(1981). Phonology and the development of the lexicon: evidence from children's
errors. Journal of Child Language, 8, 239–264.
(1987). The interaction of production and perception in the transition to speech.
Presented at the Twelfth Annual Boston University Conference on Language
Development.
Vihman, M. M. and Greenlee, M. (1987). Individual differences in phonological development:
age one and age three. Journal of Speech and Hearing Research, 30,
503–21.
Vihman, M. M. and Hochberg, J. (1986). Velars and final consonants in early words. In
J. A. Fishman, A. Tabouret-Keller, M. Clyne, Bh. Krishnamurti, and M. Abdulaziz
(eds.), The Fergusonian impact: in honor of Charles A. Ferguson on the occasion of
his 65th birthday, vol. 1: From phonology to society, pp. 37–49. Berlin: Mouton de
Gruyter.
Vihman, M. M., Macken, M. A., Miller, R., Simmons, H., and Miller, J. (1985). From
babbling to speech: a reassessment of the continuity issue. Language, 61, 395–443.
Vihman, M. M. and Miller, R. (1988). Words and babble at the threshold of language
(1978). Growth of complexity in phonological development. In N. Waterson and
C. E. Snow (eds.), The development of communication, pp. 415–42. New York: Wiley.
Zlatin, M. and Koenigsknecht, R. (1975). Development of the voicing contrast: perception
of stop consonants. Journal of Speech and Hearing Research, 18, 541–53.
(1976). Development of the voicing contrast: a comparison of voice onset time in stop
perception and production. Journal of Speech and Hearing Research, 19, 78–92.
Appendix
A complete list of transcribed child word types which either target a consonant-final adult
word or are sometimes produced as a consonant-final form, or both. (Onomatopoeic
words were generally excluded.) Types used for acoustic analysis are starred. (Note
that acoustic analyses cover ages 1;0.25 through 1;2.20.) C = consonant, V = vowel,
G = glide, I = imitated only.
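The ages in this appendix follow the years;months.days notation conventional in child language research, so that 1;1.8 reads as one year, one month, and eight days. As a minimal sketch, assuming that reading, the following shows how such ages might be parsed and checked against the span of the acoustic analyses (1;0.25 through 1;2.20). The names Age, parse_age, and in_acoustic_window are illustrative only and do not come from the study.

```python
import re
from typing import NamedTuple


class Age(NamedTuple):
    """A child's age as (years, months, days); tuples compare field by field."""
    years: int
    months: int
    days: int


_AGE_PATTERN = re.compile(r"^(\d+);(\d+)\.(\d+)$")


def parse_age(text: str) -> Age:
    """Parse an age written as years;months.days, e.g. '1;1.8'."""
    match = _AGE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a years;months.days age: {text!r}")
    return Age(*(int(group) for group in match.groups()))


# Span of the acoustic analyses as stated in the appendix heading.
ACOUSTIC_START = parse_age("1;0.25")
ACOUSTIC_END = parse_age("1;2.20")


def in_acoustic_window(text: str) -> bool:
    """True if the session age falls within the acoustic-analysis span."""
    return ACOUSTIC_START <= parse_age(text) <= ACOUSTIC_END


sessions = ["0;10.15", "0;11.9", "0;11.20", "1;0.10",
            "1;0.26", "1;1.8", "1;1.15", "1;2.20"]
print([age for age in sessions if in_acoustic_window(age)])
```

Comparing ages as tuples avoids converting to days, which would require an assumption about month length.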
Age Adult target Child token
0;10.15 cracker w
h
k, kk, kk
[also CV (CV) forms]
protoword: b, ,
hug sound [also CV forms]
0;11.9 cat (I) ka:, k
h
, k
h
, k
h
j,
kh (2)
0;11.20 block (I) a, b,
dog (I) d
1;0.10 baby pe:t, pe: p (3), pep (3)
[also CV, CVC forms]
ball pp, pep
good girl (I) kuk
h
k (2), kko:k
oops jp, p (2),
:p, p, o p, p(1)
peek pk
1;0.26 balloon (I) b.
h
*bang ba , ba
I
(2), da
I
bird b, pt, bt
h
*book bt, pk
h
*button (I) b (2)
choo-choo/toot-toot (I) tt
[also CVCV forms]
*cup (I) kk
h
, kk
*hot hat
h
(3)
*up hp
h
(3), hap
h
*box (I) bk
h
, pk
close (I) k, ko
*cold (I) kok

h
pumpkin (I) k, k (2), kk
*teeth tt
h
, tit
h
(4),
tt
h
(I), ti
h
, t ik
h
, tt
1;1.8 *bang b:
I
n, b, p:: (2),
p, b
i
ni (2),
b

: (I), p
(2),
pe, pk, be, b
n
boat (I) be
cat k
h
*clock(I) ka
k
k
h

dog (I) ta
h
*down (I) t
:., t
good girl kkkkk

*horse ht
*hot ha (3), hat (10),
ha
(2), ht, at
h
*peek (I) p
h
i, p, pe||k
x
e
squeak k
h
k
h
, kk (I)
kkkiki (i)
kke
k
kk
kkkik
h
ik
h
ik
kkkk
*teeth tit t
h
i

(2),
tit t
h
t
h
t
h
, dit (3),
ti
t
(2), dits,
ti
t
s.t i , tit (2)
tick-tock (I) ti t
h
1;1.15 *box (I) ba

burp pap
h
, papp (I),
pp (I)
*button (I) pa
n
n
*down t, d:n (I:2),
da::, eda
I
n (I),
dan
d
n
n, t
j
good girl ggk, gg, gg,

gg (2), gg (2),
kgg, kk
hat (I) h, h
h
*hot t
h
, ht, ht
h
(2)
ht , h
*round han (i), h
n
ni (I),
n
ni (i),
n
n
*up p, :p
1;2.20 apple (I) ap, hap (4), hap
,
ap (6,) hp,
p, hp (2),
hap (2), hapa,
ap, ha:p,
ap (3), hap,
hp (2), hp (2)
hap (7), hap
hp (14), hap (22)
hp, u
*around (1) wa:n
baby pipi (5), pebi (10)
pib, paba, pipi (5)
*block (I) pat (5)
*book (I) pk, p
h
ik, pk (2)
*Brian pan, pani, pa:ni (2),
pa
*building (I) pa:ni
*bus (I) pt, pat(4), pa:t
*button (I) pa:a, pa
n
n
cheese ait
, it
(3), hit
*clock (I) kak (2), kk(2)
*coat kk (24), kk (4),
gk, kk (5), ko:k
kk (3)
[also CV forms]
*down tn (2), tn: (3),
tm:, t

n
t (2), ta
*eat i:t
(i), it(i),
it
(3), hit
*foot pt, ht, to, tot,
tot, tt, tt, at,
tat, tot (4)
*hand (I) han (3)
Hooper h
p
pe (3), h
p
p
b, b, p
pt (2), hpo
hp (2), p
hp (3)
nose (I) no:, n: (2)
*stuck tk, tt
h
(19)
tt, tat
h
*tock (I) k
h
t, kaki , t
h
iki,
t
h
a ki,
h
ki
toot toot t
t
t
, tt tt, tt
h
tt
h
*up p
,
:p (2), ::b
watch (I) wa
t
, wat
9 How abstract is child phonology?
Towards an integration of linguistic and psychological
approaches
Marilyn M. Vihman, Shelley L. Velleman, and
Lorraine McCune
Our goal in this chapter is to explore the emergence of phonological systematicity
within a psychological framework. We begin by reviewing earlier work
which traces the initial phonological system back to its origins in babble and
proposes a model of the interaction of perception and production in emergent
vocal organization. Our account of the origins of system attempts to suggest
answers to the question: how can an initial system be constructed? That is,
how does the child move from the production of unrelated vocal forms
(sometimes known as item-based phonology; see Menn 1983; Waterson
1971) to an idiosyncratic holistic system (word-based phonology; see
Ferguson and Farwell 1975)? We will consider a number of issues concerning
representation, focusing on these questions: can the categorical
change in language use, from contextually embedded word production to
symbolic reference, be related to some underlying qualitative change in
mental representation? How does the linguistic notion of internal representation
relate to the psychological notion of mental representation? Finally, we
explore the issues that arise in attempting to model the onset of phonological
systematicity, such as: when are we justified in imputing a phonological
internal representation to the child? That is, what is the evidence from the
child's observable behavior (word production shapes) that a formal system of
interrelated representations has begun to cohere? How much structure should
be specified in such internal representations? Or what counts as sufficient
evidence for positing contrasting levels or units in the child's emerging
system? And what is the status of extra-systemic elements, either in the
early period, when a small repertoire of vocal production patterns is used
in response to specific familiar eliciting situations, or in the later period, when
a phonological system appears to underlie the child's productions?
To develop a sufficiently general basis for examining these issues we present
microanalyses of the early phonological development of two children who
differ in overall strategy as well as in units of organization and in the articulatory
basis for their first word productions. This allows us to illustrate some of the
ways in which individual children follow distinct paths in phonological and
lexical development, and also to place linguistic advances within a larger
psychological framework.
Origins of system
The first adultlike syllable production (canonical or reduplicative babbling)
emerges in normal infants within a narrow temporal frame (6–10 months) and
evinces strong neuromotor constraints. A small consonantal repertoire usually
is reported, reflecting simple ballistic movements (stops and nasals account for
most true-consonant-like productions); the syllable nucleus is restricted largely
to low- to mid-central or front vowels, resulting from relatively wide jaw
opening with neutral tongue placement (Davis and MacNeilage 1990; Kent
1992; MacNeilage and Davis 1990). By 10 months individual differences in
production are apparent as infants explore their vocal resources, developing
vocal motor schemes (McCune and Vihman 1987), or preferred production
patterns, which reflect both sensitivity to adult language phonetic tendencies
and emergent vocal control.
As the child develops articulatory control and familiarity, through self-monitoring,
with the sound as well as the feel of well-practiced phonetic
gestures, some routinely used sound patterns of the adult language become
perceptually salient through their resemblance to the child's own often repeated
vocal motor schemes. When the child reproduces such vocal patterns
in situationally appropriate contexts, caretakers may identify them as first
words. We interpret such early words as the products of a tight developmental
interaction. They reflect an interindividual construction process based on the
child's evolving vocal capacities and parental attunement to child vocal production
and to the focus of child attention. The child's vocal capacities themselves
globally echo dominant patterns of the ambient language and, in turn,
serve as a filter for the child's more detailed (production-driven) auditory
processing of that language.
Figure 9.1 (adapted from Vihman 1993a) displays the model of the interaction
between perception and production which we assume gives rise to the
first phonological system. A certain number of adult words are made salient by
virtue of prosodic heightening (the combination of pitch change, increased
amplitude, and increased duration which enters into word or phrasal accenting
in most languages and is usually emphasized further in caretaker talk; see
Ferguson 1964; Fernald 1984, 1991; Garnica 1977), frequent occurrence in
isolation or in sentence-final position (Aslin 1993; Goldfield 1993), and the
inherent interest of the situation of use to a particular child (Lewis 1936;
Ferguson 1978). These words are taken to provide the child, over the first
several months of life, with an aural impression of ambient speech patterns.
Attention to the sound patterns made prominent by these three factors can be
assumed to play an important role in channeling the child's prelinguistic vocalizations
toward the phonetic characteristics of the ambient language; these
salient words and phrases must make up the global auditory impression which
is reflected in the babbling of infants on the threshold of speech (Boysson-Bardies,
Hallé, Sagart, and Durand 1989; Boysson-Bardies and Vihman 1991).
Visual effects also play a role in shaping the child's prelinguistic vocalizations.
For example, the visual image of jaw opening and closing is a likely component
in the sudden emergence of the first canonical syllable production, which is
sometimes observed to occur silently before it is accompanied by vocalization
(Roug, Landberg, and Lundberg 1989). Similarly, the characteristic facial set
of adult caretakers could explain early ambient language effects on the use of
vowel space (Boysson-Bardies et al. 1989).¹ Finally, the predominance of
labials in the early words of sighted children (Locke 1983; Vihman, Macken,
Miller, Simmons, and Miller 1985), especially the hearing-impaired (Stoel-Gammon
and Otomo 1986) but apparently not the blind (Mulford 1988),
may be ascribed to the facilitative effect of the visual cue afforded by lip
closure.²
Figure 9.1. Model of the interaction of perception and production
(Diagram labels: auditory effects, that is, salient adult words (prosodic effects, frequency,
inherent interest of situation of use); visual effects; PERCEPTION and PRODUCTION,
joined by a PERCEPTUO-MOTOR LINK (ARTICULATORY FILTER); exploration through
babbling; Vocal Motor Scheme; word production pattern ('canonical form').)
These global ambient language influences on babbling are expressed in
Figure 9.1 by the dotted line linking auditory and visual effects and vocal
exploration; the interaction of the two may be taken to guide the construction
of individual vocal motor schemes, consistent motor acts performed intentionally
and . . . capable of variation and combination to form larger units
which evolve in the course of babbling (McCune and Vihman 1987: 72). The
vocal motor schemes are different for each child, regardless of ambient language,
but nevertheless are shaped by that language.
The placement of a perceptuo-motor link at the center of Figure 9.1 expresses
the view that many of the characteristics of the child's earliest words which have
been established over the past two decades, namely their relative accuracy along with
their apparent selectivity with regard to adult models (Ferguson and Farwell
1975) and their lack of interrelationship or piecemeal quality (Macken and
Ferguson 1983), are most readily understood if we assume that once a child
has begun to repeat a few vocal patterns with some regularity or apparently at will,
that is, once some vocal motor schemes have developed, these patterns add to the
salience of certain adult words that are, besides, prosodically highlighted,
frequent, and inherently interesting to the child.
More specifically, adult words that (more or less) match some pattern which
the child has come to produce with facility eventually will be attempted by the
child, in appropriate (remembered) context. This can be taken to be the characteristic
route by which the first words are uttered by children and identified by
caretakers, often before the child has progressed cognitively to the point of
making adultlike general or symbolic reference to classes of objects and events
(Bates, Benigni, Bretherton, Camaioni and Volterra 1979; Vihman and
McCune 1994). That is, when a familiar situation arises in which a particular
word or phrase (allgone, byebye, duckie, no) tends to be repeatedly expressed
by adults, if one of those words also happens to be close enough to a vocal
pattern the child has come to know through self-monitoring and can now make
at will, the child is likely to be reminded of that matching pattern (Rovee-Collier,
Sullivan, Enright, Lucas and Fagen 1980), resulting in what adults
identify as first (context-limited) word production.
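The selection process just described, in which adult words become candidates for early production when they are both salient and close to a pattern the child already produces at will, can be illustrated with a toy sketch. Everything in it is hypothetical: the AdultWord class, the coding of each word as a rough consonant skeleton, the salience flags assigned to the example words, and the threshold of two salience factors are assumptions made for illustration, not part of the model in Figure 9.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdultWord:
    gloss: str
    consonants: tuple[str, ...]  # rough consonant skeleton (hypothetical coding)
    highlighted: bool            # prosodically highlighted in caretaker talk
    frequent: bool               # frequent in isolation or sentence-final position
    interesting: bool            # situation of use is interesting to this child


def salience(word: AdultWord) -> int:
    """Count how many of the three salience factors apply to the word."""
    return sum((word.highlighted, word.frequent, word.interesting))


def matches_scheme(word: AdultWord, scheme: set[str]) -> bool:
    """Crude stand-in for the articulatory filter: every consonant of the
    adult word must belong to the child's vocal motor scheme."""
    return all(consonant in scheme for consonant in word.consonants)


def candidate_first_words(words, scheme, min_salience=2):
    """Return glosses of words that pass the filter and are salient enough."""
    return [w.gloss for w in words
            if matches_scheme(w, scheme) and salience(w) >= min_salience]


# Example words are taken from the text; the flags are invented for illustration.
WORDS = [
    AdultWord("allgone", ("l", "g", "n"), True, True, True),
    AdultWord("byebye", ("b", "b"), True, True, True),
    AdultWord("duckie", ("d", "k"), False, True, True),
    AdultWord("no", ("n",), True, True, False),
]

print(candidate_first_words(WORDS, scheme={"b"}))
```

With a scheme limited to the labial stop, only byebye is selected; a child whose scheme includes a nasal would instead pick up no. This mirrors the point that the same set of salient adult words yields different first words for different children.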
The arrow at the bottom of Figure 9.1 represents the route from phonetically
salient adult words, in combination with the development of one or more vocal
motor schemes, to word production patterns, or production formulae that allow
the child to make rapid lexical progress by simplifying the number of options
available when a word is uttered (Kiparsky and Menn 1977; Menn 1983). It is
this last step with which we will primarily be concerned here, a step that
constitutes the bridge from phonetics into phonology.
Representation
The term representation is widely and somewhat ambiguously used in scientific
fields, including linguistics and psychology. To clarify the issues raised in
this chapter, in which the term representation is applied in several senses, we
first will address controversy regarding mental representation in the field of
psychology and introduce evidence for relations between mental representation
and language. We then consider the systematically ambiguous meaning of
internal representation, as used in the field of linguistics, to prevent misunderstanding
of the comparable usage here.
Psychological views of mental representation
Two basic positions regarding mental representation in the infant are current in
psychology today: (1) it is present from birth and demonstrable early in life,
with subsequent changes reflecting maturation of innate capacities (Leslie
1987); and (2) it develops over the first two years of life as a consequence of
the organization of motor and behavioral actions in relation to the developing
central nervous system, with onset following the first birthday and roughly
corresponding to significant developments in language (McCune-Nicolich
1981b). Investigations of the behavioral expression of representation from
these disparate viewpoints show little overlap in the tasks used or the ages of
the subjects tested. For example, those who espouse the innatist view study
infants in the early months of life, using differential looking time or the infant's
tendency to continue the trajectory of visual following when an object disappears
from view (Baillargeon 1987). Studies deriving from the developmental
position begin after 6 months of age and utilize motor responses to phenomena
of absence, such as object search (Ramsay and Campos 1978; Uzgiris and Hunt
1975) or representational play (McCune 1995; Nicolich 1977).
Sartre's (1966) contrast of perceptual versus imaginal or representational
experience suggested that the early phenomena are best understood as perceptual
processing of present reality, which may include memory
of the immediate past and expectancy regarding the immediate future.
According to Sartre, perceptual processing draws continually from the sensory
present, in which the contents of consciousness can be specified in relation to
phenomena observable in the environment. In contrast, imaginal (or mental)
representation may use a perceived event as a starting point (e.g., a portrait of a
friend), but the resulting experience is an instance of pure consciousness, an
internal contentful state that is not directed at perceived reality.
If we take perceptual processing to be infants' original tendency, the developmental
course of the ability to relate the present to the absent and the past to
the present provides an index to the emergence of a capacity for mental
representation. Memory research spanning the age range of 8 weeks to 6 months
by Rovee-Collier and her colleagues (e.g., Rovee-Collier et al. 1980) has
provided the strongest indicator of both the strengths and the limitations of
infants' capacity to retrieve past experience. In these studies, infants learn to set
a mobile in motion by kicking the leg attached to the mobile by a ribbon. After
several training sessions it is possible to test both immediate and long-term
retention (up to weeks and months). The results clearly indicate that even
8-week-olds are capable of retention. However, the degree of dependence
of this effect on exact replication of the training context, the limitations on the
length of the retention period, and the relative power of cuing to reinstate the
memories each show strong developmental trends. Furthermore, the behavioral
expression of memory in this task is itself a motor response (foot kick) that
occurs in the context of perceptual recognition. These studies demonstrate the
strengths of perceptual motor processes for learning and memory. The fact that
at 6 months of age infants succeed at this paradigm but fail to search for and
retrieve small hidden objects suggests that a qualitatively different type of
processing may characterize memory that depends upon contextual reinstatement
(here termed perceptually dependent memory) as opposed to
memory that is able to function in the presence of confusing contextual cues
(here termed mental representation). We use the term mental representation to
refer to a contentful mental state distinguished from perception by its capacity to
reference absent and past realities.
Mental representation and language
Piaget (1962) suggested that the infant's ability to retrieve an object placed in
the experimenter's hand or a container and then released beneath a cloth outside
the infant's view indicates an initial capacity for mental representation (Stage 6
of Object Permanence). Given the nature of language as symbolic or representational,
it was at first assumed that development in understanding of object
permanence would correlate with language development. In fact, children in
the early stages of language acquisition are capable of solving the Stage 6 task.
However, the number of object permanence test items passed in Stage 6 shows
no continued correlation with advances in language (Bates et al. 1979;
McCune-Nicolich 1981). The lack of correlation can be attributed to the fact
that entry into Stage 6 constitutes a culminating milestone in the concept of
object permanence but marks only the onset of language use.
Representational play, which begins only toward the end of the sensorimotor
period, does show reliable relationships with later language milestones (Bates
et al. 1979; McCune 1995; McCune-Nicolich and Bruskin 1981). The transition
to representational play, in which the child first demonstrates knowledge of the
function of small replicas (toy cups and saucers, tiny cars and trucks) and later
indicates awareness of the pretend nature of such acts by vocal elaborations
and coy smiles, corresponds to the production of early nonreferential (context-
limited) words in precocious talkers, whereas referential words are likely to be
noted at the transition to play in which two or more acts are combined. We
interpret the temporal correspondence between this combinatorial play and
referential language use as following from the more differentiated character of
mental representation, which allows an event to be portrayed with a variety of
gestures and objects and vocal forms to be produced outside of their original
context in relation to a variety of new situations. For example, doggie, learned
with reference to the family pet, is now produced in relation to the neighbor's
dog and to pictures of dogs as well. It should be noted that close play-language
correspondences characterize early talkers, whereas studies have indicated
that later talkers may show the play milestone well before the corresponding
language is observed. McCune (1992) demonstrated that lack of available vocal
motor schemes accounted for the time lags between representational and lan-
guage milestones for some subjects.
Internal representation in phonology
In child phonology the term internal representation is intended to characterize
underlying aspects of the child's understanding and production of speech.
Diagrammatic descriptive models typically are used to characterize the structure
imputed to the child's system. The child's internal representation and the
linguist's description thereof are sometimes assumed to be isomorphic. When
the linguist claims psychological reality for internal representation, that reality
can best be considered roughly equivalent to the psychologist's term mental
representation, which is a form of mental processing, or a contentful state of
the organism. The linguist's model attempts to describe the organization or
complexity of the relations among elements imputed to the child's system as
evidenced by systematic relations among the utterances produced. The model
is, therefore, a characterization of information about the child's system as we
know it, whereas the child's internal representation is a system of unknown
parameters capable of generating the utterances appropriately described by the
model (see Van Gulick 1982).
Internal representation generally is taken to refer to a form of mental storage
(e.g., Locke 1988; Menn and Matthei 1992). For example, Locke (1988) argued
that we cannot know whether a phonology is needed until it is determined that a
discrepancy exists between stored and produced patterns. Nor can we propose
explicit phonological rules until we have inferred the phonetic structure of internal
representations (p. 4). Developmental approaches to internal representation are
rare (but see Velleman 1992; Waterson 1981); to our knowledge, there has been
no previous effort to explicitly relate the linguist's internal representation to the
psychologist's mental representation.
In our view, the production of context-limited words, which have been
observed to occur prior to the onset of combinatorial representational play
(McCune 1992), is in some ways comparable to the perceptually dependent
memories of the Rovee-Collier experiments. A specific familiar event evokes an
intentional state and associated vocalizations (Bloom 1991). There is minimal
differentiation among such components as speaker, hearer, physical context,
and vocal motor action (Werner and Kaplan 1963). Such production requires the
availability of one or more vocal motor schemes, but not a word production
pattern generalized across a variety of word types.
Somewhat later, a new capacity for mental representation allows the elements
of the speech situation to be differentiated, yet integrated. An unfamiliar event
might now evoke the word associated with a related event. A given event might
be referenced by one of several words. The child experiences an increasing
range of potential representational meanings, whereas vocal motor skill may
remain limited. Vocal expression thus comes to rely on the word production
patterns which now evolve, made possible by the increased capacity for
separating word form from situation of use and for the internal experience of
relationships between linguistic elements, which facilitates the juxtaposition of
one or more vocal motor schemes and a range of phonetically related adult
words.
Storage need not be postulated at either of these developmental time points.
Whereas in the earlier period a familiar context may provide sufficient perceptual
support to elicit instantiation of the one vocal motor scheme associated with
the situation, in the later period instantiation of a variety of situationally
appropriate adult-based forms is possible, given adequate articulatory capacity.
Furthermore, once the child's vocal forms are no longer embedded in a particular
situation of use, they can be compared or superposed (as in connectionist
models, such as Stemberger 1992) as a basis for the development of a generalized
word production pattern. We believe that this is the basis for the beginnings
of phonological systematization, which we find to emerge at about the same
time as the first referential or generalized use of words. At this point, something
more abstract than a vocal motor scheme operating in combination with perceptual
attention to particular auditory patterns has materialized. This is a
phonological mental representation which may become instantiated when reference
to the corresponding object or event is contemplated.
Modeling the child's system
There is a wealth of persuasive evidence regarding the importance of the word
and syllable levels in child phonology, particularly in the early period (Chiat
1979; Ferguson and Farwell 1975; Kent and Bauer 1985; Macken 1979;
Vihman 1992). A hierarchical model of phonological structure is needed to
capture this important aspect of children's systems. It appears to us that the most
viable option currently available is nonlinear phonology.
Several phonologists have suggested nonlinear models of early child words,
especially for the frequently observed patterns of harmony and reduplication.
The task has been approached in a variety of theoretical frameworks, including
prosodic (Waterson 1971), parametrical (Fikkert 1991; Lleó 1992), connectionist
(Berg 1992; Menn and Matthei 1992), and cognitivist (Menn 1978; Velleman
1992). All of these approaches seek to account for the fact that the child's
phonology is simpler than the adult's, at least at the output level. However,
depending on the child and/or the model, 'simple' may have many different
meanings; these will be reviewed briefly.
A child system may be simpler in its hierarchical structure, lacking whole
levels of representation (e.g., the skeletal or segmental tier). Menn and Matthei
(1992), for example, suggested within a connectionist framework that priming-
type interactions among similar articulatory patterns (words) may induce the
beginnings of autosegmental structure potentially . . . without segmentation
below the word level (p. 243). Velleman (1992) suggested that the phonological
representations of children with highly restrictive, often babble-based word
recipes may have lexical representations with almost no structure at all (e.g.,
word-level representation only), relying on their existing articulatory patterns to
provide whatever redundant phonetic detail is required to flesh out productions.
The child's representation also may be simpler within a given level. For
example, early syllable constituents may be nonbranching (Fee 1991; Fikkert
1991; Lleó 1992; Ohala 1991; Velleman 1992). That is, only CV syllables may
be possible at first. If no branches are available at the word level (i.e., if all words
are monosyllabic), then word and syllable are synonymous and no lexical
distinction need be made.
The child's system may be less integrated, with consonant and vowel effects
occurring independently due to planar segregation (Fikkert 1991; Lleó 1992;
McDonough and Myers 1991; Macken 1993; Velleman 1992), in which consonants
and vowels occur on separate phonological tiers. Such segregation was
originally proposed for Semitic languages in which morphological templates
require either consonants or vowels, for other types of templatic morphology in
which linear order of consonants and vowels is redundant, and for languages
with very simple CV phonotactic structures (Lleó 1992; McCarthy 1989).
Planar segregation in child phonology allows vowels to be transparent to
consonant harmony, and vice versa, and accounts for the increased frequency
of such harmony processes in early phonologies. It also provides a model of
C/V metathesis that is consistent with principles of adult phonology: if conso-
nants and vowels are on separate tiers, then they may appear to switch places
without violating constraints against crossing association lines.
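The idea of planar segregation can be made concrete with a minimal sketch. It assumes a toy representation in which a word is a CV skeleton plus two independent tiers; the function name `linearize` and the plain-letter transcriptions are illustrative choices, not part of any model cited above.

```python
def linearize(skeleton, consonants, vowels):
    """Interleave a consonant tier and a vowel tier along a CV skeleton."""
    c_tier, v_tier = iter(consonants), iter(vowels)
    segments = []
    for slot in skeleton:
        if slot == "C":
            segments.append(next(c_tier))
        elif slot == "V":
            segments.append(next(v_tier))
        else:
            raise ValueError(f"unknown skeletal slot: {slot!r}")
    return "".join(segments)


# Same two tiers, two skeletons: the consonant and vowel appear to
# switch places, but neither tier has been reordered.
print(linearize("CV", ["m"], ["a"]))  # ma
print(linearize("VC", ["m"], ["a"]))  # am

# A repeated consonant melody is unaffected by the intervening vowels.
print(linearize("CVCV", ["k", "k"], ["a", "i"]))  # kaki
```

Because each tier is consumed independently, apparent C/V metathesis falls out of a change in the skeleton alone, which is the sense in which no association lines need to cross.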
Children's lexical feature specifications may be minimal as well. Such
underspecification is identified in child phonologies when elements of surface
form are completely predictable. This may stem from pervasive harmony
patterns, in which the degree of feature spreading is so great that the positions
receiving harmony are thought to be vulnerable to spreading due to the lack of
any feature specification of their own. For example, if a child demonstrates
regressive and progressive harmony affecting target coronals (or dento-alveolars)
whenever a labial or dorsal (velar) consonant occurs anywhere in
the word, then we assume that [coronal] is not underlyingly specified.
Redundancy, or predictability in surface features, also may stem from phonetic
or phonotactic restrictions. For example, if all consonants in a child's system are
stops, then [continuant] is predictable and need not be lexically marked.
Features also may be specified but lexically unordered where their order is
predictable. This lack of lexical ordering is manifested in apparent C/C or V/V
metathesis in children (e.g., Virve's productions of [asi] for /isa/ 'father' and
[am-i] for /ma/ 'mother', described in Vihman 1976) and in redundantly
ordered complex clusters in some adult phonologies. (See Velleman 1992 for
further discussion.)
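The coronal example above can be sketched in a few lines. This is a simplified illustration under stated assumptions: each consonant slot carries either a place label or `None` (lexically unspecified), the first specified place spreads to every unspecified slot in the word, and [coronal] is filled in only when nothing is specified. The function name `realize_places` and the default value are hypothetical.

```python
def realize_places(places, default="coronal"):
    """Fill unspecified consonant places (None) for one word.

    Assumption: if any place is lexically specified, the first specified
    value spreads to every unspecified slot, in either direction;
    otherwise the default place is filled in.
    """
    specified = [p for p in places if p is not None]
    filler = specified[0] if specified else default
    return [filler if p is None else p for p in places]


# Regressive harmony: an unspecified first slot takes on a later dorsal.
print(realize_places([None, "dorsal"]))   # ['dorsal', 'dorsal']
# Progressive harmony: an unspecified second slot takes on an earlier labial.
print(realize_places(["labial", None]))   # ['labial', 'labial']
# No specified place anywhere: the default surfaces.
print(realize_places([None, None]))       # ['coronal', 'coronal']
```

The sketch conflates default fill-in with underspecification; as the text goes on to note, a child's unspecified features need not be defaults.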
Sometimes an unspecified feature also will serve as a default feature value, to
be filled in at the surface level whenever the corresponding C or V slot remains
otherwise unspecified (Fikkert 1991; Lleó 1992; Velleman 1992). Although
coronal has been proposed by some as both a default and a lexically unspecified
feature for adult languages (Stemberger and Stoel-Gammon 1991), a child's
unspecified features need not necessarily be defaults (Lleó 1992). Either spreading
of a harmonic feature or the lack of any surface realization (omission) may
be the fate of such unspecified elements. Although some features may not need
specification, feature specifications that are necessary may encompass a broader
domain than in adult phonology, applying to an entire mora, syllable, word, or
even phrase (Iverson and Wheeler 1987; Velleman 1992). The eventual trickle-down
of such features to the segmental level has been referred to as
deautosegmentalization (Goldsmith 1979; Spencer 1986). Similarly, rules or processes
may show a greater breadth of application. For example, spreading of a particular
feature may affect all possible recipient segments in either direction (right or
left) over a large domain, such as an entire phrase (Lleó 1992).
Whether these options are available to all children, specified by innate parameters,
determined by some characteristics of the language to which the child is
exposed, chosen by the child based on idiosyncratic perceptual, physiological,
or cognitive biases, or some combination of the above is an open and widely
debated question. In any case, the course of phonological development includes
the addition of complexity to any or all of these aspects of the representation.
We prefer to attribute the minimal possible structure and the fewest possible
rules to the child at any given point in development and will attempt to
demonstrate that a nonlinear model can be constructed to account for developmental
increments of phonological complexity, attributing complexity to
representations rather than rules and adding rather than changing structure
over time. Given our assumptions about the origins of early words in production
and perception and about the relation of emergent phonological systems to the
child's evolving representational capacity, we see no need to posit specifically
linguistic innate structures (contra, e.g., Macken 1992).
Identifying the onset of system
Although the emergence of phonological system is clear, even dramatic, in most
of the children we have observed, bits of the system typically are already
apparent (at least in retrospect) before they come together sufficiently to lead to
a sudden flourishing of diverse lexical items. The system itself coheres gradually,
over time, but when a critical point is reached (either cognitively or
phonologically; it may be impossible to decide which is determinative), the
system seems suddenly to have power enough to strongly affect lexical choice
and production and to assimilate adult words that do not provide an obvious fit
with the child's template. Close analysis of the phonological progress of two
children, reported below, will demonstrate that the onset of system can be
recognized not only in the interesting cases of distortion of adult models
(regression in accuracy) in child word production, but also in the spurt in
acquisition of words that fit the template.
Extra-systemic elements
Adult phonological systems include marginal elements, especially in words
which are salient because they are exotic (Zsa Zsa), humorous (schmaltzy), chic
(au jus, karaoke), or newsworthy (détente, Sri Lanka, Schwartzkopf). Similarly,
the child's production may include a small set of extra-systemic words, recognizable
by their inconsistency with the majority of the child's forms. Children
sometimes produce surprisingly accurate renditions of difficult words before a
system has coalesced (e.g., Hildegard Leopold's famous production of pretty).
Such progressive idioms may be regularized when the child's phonological
system has become established, or they may persist as extra-systemic elements.
Later, extra-systemic words may reflect aspects of the adult shape as perceived
by the child as well as aspects of the child's existing template, and so include
both systemic and extra-systemic elements.
Words that are partially or wholly extra-systemic can be expected to be
shorter-lived than other forms. The child's system, by definition, is more
consistent and persistent than other aspects of production and tends to dominate
lexical production and to dictate selection, once it is in place. However, extra-systemic
items may serve as precursors or even triggers for change in the child's
system; the system may accommodate to them some time after they first appear
as marginal elements.
Two phonological profiles
In a paper that focused on syllable production, Vihman (1992) presented
sketchy profiles of the initial steps in lexical development of two children as
well as the syllables they practiced at 9–11 months. One of these children
seemed to base his early phonology on the syllable. The other child seemed
instead to operate with a phonetic gesture involving tongue fronting and raising,
or palatal articulation; the syllable did not play an important role for her. In this
chapter we consider the phonetic and phonological development of the same
children in finer detail, attempting to trace the interaction of perceptual biases,
vocal motor schemes, and representational capacity in the formation of phono-
logical systems.
Timmy: the syllable pattern
Timmy provides an example of a child who progressed phonetically rather
slowly and apparently effortfully. His words and babble forms were unusually
difficult to distinguish for several months (Vihman et al. 1985). His first six
months of word production were based largely on phonetic variants of a single
syllable shape, <Ca>, with a gradual increase in the consonantal choices available.
Nevertheless, it is possible to distinguish an early, presystematic period
(9–13 months) and a later, system-based period (from 14 or 15 months on).
At 9 months Timmy already responds with <ba> to adult monosyllabic /b/
words (ball, block); by 10 months he produces <ba> spontaneously in situations
associated with those words (at 10 and 11 months <ba> is also produced in
imitation of basket, bell, boat, book, button and spontaneously for box; by 15
months bird, brush, bunny, baa(-baa) are produced as <ba>). From 11 months
on Timmy responds to /k/ words (kitty, quack-quack, car, duck, key) with
<ka(ka)>. The word-length distinction (between monosyllabic /b/ words and
disyllabic /k/ words) derives from the models, but is maintained somewhat
inconsistently, particularly after the first month of use for each word. There is
little evidence of a phonological system operating here. Instead, Timmy draws
on one of the articulatorily simplest syllables, [ba] (Davis and MacNeilage
1990; Vihman 1992), when he is reminded to produce his matching vocal
motor scheme by situations in which a familiar auditory pattern is commonly
produced by adults (in relation to some of his favorite toys, balls, blocks, bells).
Similarly, he produces his second vocal motor scheme, <ka(ka)>, in situations
associated with stop-initial word forms other than /b/ (car, kitty, Teddy, later also
a deictic form which begins as a response to Great Gable, referring to a
frequently identified drawing of a mountain; Vihman and Miller 1988).
It is only at 14 months that we see the extension of this pattern, first to a single
word, eye, assimilated to Timmy's pattern as [ja], then (at 15 months) to words
that elicit a range of different consonants: [a] for words characterized by
labiality and continuant friction, first Ruth, with its rounded initial approximant
and final fricative, later fire, flies, flowers, and plum [cf. Waterson 1971]; [ɟa]
for light, where the palatal place appears to derive from the nuclear diphthong
while the stop articulation derives from the final consonant; [na] for nose and
later Nana; and [ja] for ear, hair as well as eye, these latter perhaps best
glossed, together with the probable phonological model eye, as response to
questions about 'my body'.
At 15 months, furthermore, Timmy for the first time produces two forms
outside his vocal motor scheme, both involving special sounds or sound effects
in the adult models: hiss is reproduced as [s], while the word moo, produced by
adults with a long, low-pitched vowel, is reproduced as [:] and the related
form [m m] is used to imitate the phonetically similar words moon and
mushrooms. A week later moo and moon have both been incorporated into
Timmy's system as <ma> (with phonetic variants such as [m :m];
extra-systemic vowel length, which derives from adult modeling of moooo, is
incorporated into variants of both words).
Let us assume that an internal representation begins to take shape at 14 months,
when Timmy first extends the two related vocal motor schemes, <ba> and <ka(ka)>,
to a third word type, <ja>. Until now vocal production in appropriate
situational context has involved a choice of two vocal motor schemes, labial and
not labial (see Figure 9.2 and Table 9.1). Now a palatal choice is added to the
repertoire of lexical possibilities, but there is little else that is not redundant in
Timmy's forms. The vowel [a] is predictable as the only consistent vowel; the
variations in production are wholly unsystematic. Labial is Timmy's default
consonantal feature value and thus need not be represented lexically. The word is
still equivalent to the syllable and the syllable to the sequence C + a. Thus, there is
only one autosegmental level (W, the prosodic word) and only two lexically
represented feature geometry options. Only one consonant type may occur within
a word, so we assume that consonant features mark the entire word, not the
individual consonants. Indeed, there is no evidence that individual consonants
play a role in Timmy's phonology at this point.
Because of the extreme simplicity of Timmy's system, we have no need to
posit phonological rules such as spreading to account for the harmony in his
productions. Similarly, the issue of planar segregation is moot; there is no
possible interference between vowel and consonant tiers because there is no
need for a vowel tier. In addition, because the adult model provides information
about iteration, which occurs as an expression of attention (affecting new words
only), we assume that it is not represented lexically. In short, Timmy has added
one more syllable to his repertoire, suggesting emergent systematicity (see
Table 9.1). But his vocal motor schemes, together with the information about
number of syllables provided by the adult model, continue to suffice to account
for almost everything about his word production.
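The 14-month representation described above is small enough to write out as a toy model. The sketch below assumes that the lexical entry stores at most a word-level place feature, that labial is the unmarked default, that the vowel is always [a], and that the number of syllables is supplied by the adult model rather than by the lexicon. The names `ONSET_FOR_PLACE` and `produce` are hypothetical.

```python
# Word-level place feature -> surface onset. None stands for "no lexical
# place specification", which surfaces as the labial default.
ONSET_FOR_PLACE = {None: "b", "dorsal": "k", "palatal": "j"}


def produce(place=None, syllables=1):
    """Realize a word as one or more C + a syllables sharing one onset."""
    onset = ONSET_FOR_PLACE[place]
    return (onset + "a") * syllables


print(produce())                       # ba
print(produce("dorsal", syllables=2))  # kaka
print(produce("palatal"))              # ja
```

Nothing in the sketch refers to individual consonant or vowel slots, which mirrors the claim that only the word level (W) is needed at this point.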
With the expansion of available word (or syllable) shapes at 15 months,
Timmy adds manner specifications to his representation, further elaborating
feature geometry, since the first contrasts at a given place of articulation now
emerge: stop-initial <ba> contrasts with fricative-initial <a> and glide-initial
<ja> contrasts with stop-initial <ɟa>. Within the same month, a stop:nasal contrast
also appears, as moo and moon enter the system as <ma> (as evidenced by
the loss of accuracy in vowel production). Consonantal feature geometry is the
only lexical element exhibiting change; the rest of the system remains as before.
Nevertheless, the multiple lexical expression of both place and manner contrast
seems sufficient to suggest that a phonological system is now minimally
established.
At 16 months Timmy's word production reflects a number of systematic
advances as well as some continuing extra-systemic experimentation. Because
syllables within a word may now contrast, the representation must include a
separate syllable level. Iteration itself has now entered the system as a separate
lexical possibility: <ba> block/peg, boat contrasts with <baba> baby, bracelet.
This is further evidence of lexical status for the syllable. In addition, a vowel
contrast is now available, although the new vowel, [i], occurs in only one
monosyllable ([di], the letter D) and as second vowel in a sequence
<a . . . i>, where it contrasts with the iterated sequence <a . . . a> (<baba>
baby, bracelet vs. disyllabic <bai> balloon, boy, please; <kai> car; <kaki>
cookie; and the restructured <ai> eye as well as <aija> hiya). Since default [a] is
no longer the only vowel and its distribution is not completely predictable, the
lexical representation must now explicitly include [i]. This is the first sign of
emerging skeletal (C-V) and segmental levels of representation.
Figure 9.2. Development of a lexical representation
(Panels: 14 months, <ba, ka, ja>, Word marked for place of articulation; 15 months, <ba, ka, ja, na, a, ja>, Word [place, manner]; 16 months (a) and 16 months (b), Word [C place, manner] with syllable, C, and V slots and [V place]. Key: ( ) = optional elements; { } = emerging elements; σ = syllable.)
Table 9.1. Development of a child's lexical representation (Timmy)
Portions provided by: motor control (vocal motor scheme); perception (adult model); lexicon, i.e., phonological representation (feature geometry, autosegments, skeleton).
Age 10 months
  Word forms: <ba>
  Vocal motor scheme: Ca syllable
Age 11–13 months
  Word forms: <ba>, <ka>
  Vocal motor scheme: [a], +/- labial; open syllables
  Adult model: iteration
Age 14 months
  Word forms: <ba>, <ka>, <ja>
  Vocal motor scheme: [a], labial default; open syllables
  Adult model: iteration
  Feature geometry: [3-way place contrast]
  Autosegments: {W}
Age 15a
  Word forms: <ba>, <ka>, <ja>, <na>, <a>, <ɟa>
  Vocal motor scheme: [a], labial default; open syllables
  Adult model: iteration; [u], [s]
  Feature geometry: C place and manner
  Autosegments: W
Age 15b
  Word forms: as above + <ma>
  Vocal motor scheme: [a], labial default; open syllables
  Adult model: iteration
  Feature geometry: C place and manner
  Autosegments: W
Age 16a
  Word forms: C1V1(C2)(V2): C1 = above + [t, s]; C2 = C1 or labial; V = [a, i]
  Vocal motor scheme: [a] default; open syllables; C order if C2 ≠ C1; V order if V1 ≠ V2: [a] 1st
  Adult model: [u]
  Feature geometry: C place and manner: 2 features, unordered, or 1 feature per word; [a] unspecified; {[i] specified at segmental level}
  Autosegments: W, σ
  Skeleton: {CV(C)(V)}
Age 16b
  Word forms: [a, i, u]; vowel sequence no longer predictable
  Vocal motor scheme: [a] default; open syllables; C order if C2 ≠ C1
  Feature geometry: C place and manner: 2 features, unordered, or 1 feature per word; [a] unspecified; [i, u] specified at segmental level
  Autosegments: W, σ
  Skeleton: CV(C)(V)
Notes: {} = emergent element/level; W = word (prosodic word); σ = syllable; a = earlier in month; b = later in month.
Until now, the formal problem of representing consonantal harmony across
intervening vowels did not arise, as no lexical vowel specification was necessary
([a] was redundant). Now this issue comes to the fore: if vowels and
consonants share feature specifications, as proposed by some phonologists
(e.g., Sagey 1986), then we have to explain why, for example, the feature
[dorsal], which we have placed at the word level and which is meant to trickle
down and create harmonic [k]s in <kaki>, does not also affect the two vowels
in this word. There are three plausible solutions: (1) assume that consonants and
vowels do not share feature specifications; (2) assume that consonant features
are now also represented at the segmental level and that the high frequency of
harmony in Timmy's system is a remnant of his previous word-level consonant
feature specification; or (3) propose that Timmy's consonants and vowels are on
separate tiers (planar segregation), an interpretation that is compatible with
his simple word shapes but one to which we had no need to appeal previously.
Without taking a stand on the issue of vowel versus consonant features, we
assume the last of these options, as it attributes the least possible lexical
structure to Timmy's phonological system.
Similarly, the possibility of paradigmatic consonantal contrast in place has
now been extended to allow syntagmatic contrast within a word: <nama>
Simon, <gaba> goodbye. Again sequence is predictable: [labial] always occupies
second position in noniterative words. Because the consonants within a
disyllable are always either identical or ordered in a predictable way, we assume
that consonant features remain autosegmental. Labial can no longer be specified
as a default, but must be represented in its own right. The word is marked either
for one feature (or set of features) which, in production, spreads to all closants or
for two features (or sets of features) which are lexically unordered. Output rules
will specify the ordering.
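The division of labor between an unordered lexical entry and an ordering output rule can be sketched as follows. The sketch assumes that a word's onsets are stored as an unordered set, that a single member spreads to both syllables, and that in a two-member set the labial is placed second; the vowel is fixed at [a] for simplicity. The names `LABIALS`, `order_onsets`, and `disyllable` are hypothetical.

```python
LABIALS = {"b", "m"}  # assumption: the labial onsets attested in the examples


def order_onsets(onsets):
    """Turn an unordered set of one or two onsets into an ordered pair."""
    onsets = set(onsets)
    if len(onsets) == 1:
        (only,) = onsets
        return [only, only]  # one feature set spreads to both slots
    if len(onsets) == 2:
        labial = onsets & LABIALS
        if len(labial) != 1:
            raise ValueError("ordering is defined only for one labial plus one non-labial")
        (second,) = labial
        (first,) = onsets - labial
        return [first, second]  # output rule: the labial takes second position
    raise ValueError("a word carries one or two onset specifications")


def disyllable(onsets, vowel="a"):
    """Spell out a disyllabic word from unordered onsets and one vowel."""
    return "".join(c + vowel for c in order_onsets(onsets))


print(disyllable({"n", "m"}))  # nama
print(disyllable({"g", "b"}))  # gaba
print(disyllable({"b"}))       # baba
```

Because the lexical entry is a set, the same output results whichever order the onsets are listed in, which is the point of treating order as predictable rather than stored.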
The place feature [coronal], first introduced in nasal <n> at 15 months, now
occurs in stop <t> followed by either <a> (daddy <tata>) or <i> (D <di>). The
feature also occurs in the fricative-initial syllable <sa>, used to assimilate the
word fish to Timmy's system. Thus, previously extra-systemic [s] now has been
incorporated. The vowel [u] remains extra-systemic, occurring only in a variant
of onomatopoeic toot.
Three unusually difficult words show continued phonetic (extra-systemic)
exploration: helicopter is produced as <ga(li)ga>: [gaga gliga
gliglig]; attempts at tape-recorder show the same range of variation. The
word light, represented as <ɟa> at 15 months, now reflects Timmy's new
attention to the initial /l/. It can be represented as <aija>, like hiya, but shows
considerably more variation, the medial consonant being produced as a voiced
or voiceless palatal glide, voiced or voiceless palatal fricative, voiced alveolar
fricative, or even voiced dorsal fricative or sequence dorsal + [l]. Finally, we
represent the name Simon as <nama>, but the variation here is also considerable,
suggesting some child attention to the discrepancy between adult model
and his own word shape. The vowels range over front and back low variants as
well as palatal on- and offglides ([ni m], [ni m n m ni me]) and the
consonantal sequences [m . . . m] and [m . . . n] occur alongside the system-based
sequence <n . . . m> (4 tokens out of 6).
In summary, Timmy's representation now includes a syllabic as well as a
word level, and both vowel and consonant features now are represented.
Because vowels show some autonomy, they must be individually specified,
although <a> remains the default vowel. For this reason we are forced to posit a
skeletal tier for Timmy at this point. By definition, this tier must include both
vowel ([+syllabic]) and consonant ([-syllabic]) slots. Vowels are individually
specified and therefore can be considered to be the first genuine segments in
Timmy's system. Consonant slots remain unspecified, because order of occurrence
of features remains predictable (represented at word level). Planar segregation
prevents these autosegmental consonant feature specifications from
affecting vowels as well.
In a later 16-month session, the combinatorial potential of the system is
unleashed (see Table 9.2). Whereas <i> earlier combined only with <t> or, in
second syllable, <k> or <l>, it now occurs in monosyllabic <bi> bee, <ki> key,
and disyllabic <mami> mummy, <nimi> Simon as well as <kaki> coffee, <kuki>
computer, and <kibi> good boy. The previously extra-systemic vowel <u> has
entered the system, occurring in monosyllables <tu> Drew, juice, toe, <mu>
moo, moon (tokens of both words still marked by extra-systemic lengthening
and low pitch) as well as disyllabic <kuki> computer, <kaku> bicycle. In
addition, sequences of <a . . . i>, <a . . . u>, and are permitted as
well as <k . . . b> and <n . . . m>.
In addition, the child produces an unusual contrast of <ba 'bi> bottle but <ba
'bu> bubble. The second syllables, <bi> versus <bu>, appear to carry the
medial [d]:[b] contrast of the model into the vowel of the child's production
as nonlabial [i] versus labial [u], suggesting attention to the adult contrast as
well as continuing constraints on possible within-word syllable sequences.
Since no instances of the sequence labial . . . nonlabial have yet appeared
(cf. Simon <nimi>, money 'coin' <mami>), the expected solution
*<badi> vs. <babu> is not available. Furthermore, word forms combining
both consonant and vowel contrast do not yet occur either (good boy, for
example, might otherwise be produced as *<kubi> or *<kuba>).
On the word level we now have multiple examples of mono- versus disyllabic
forms, corresponding in each case to one versus more syllables in the model
(monosyllabic ball, boat, bee, car, key, Drew, juice, toe, moo(n), and neck/sun,
both produced as <na>, versus disyllabic balloon, bottle, bubble, coffee, good
boy, money, quack-quack, and Simon, but also computer and bicycle). However,
sequences of [-low] . . . [+low] and of [+labial] . . . [-labial] are still excluded.
Consonants have yet to be completely released from the restrictions of harmony
and predictable order to become segments in their own right.
Voicing of obstruents is only partially consistent and never contrastive. All
tokens of <b> and <> syllables are voiced; similarly <da>, <di>, and <du> are
voiced, regardless of the model (including toe and toot(-toot)). However, the
dorsal words vary, with [k] for the old word quack-quack; [g] for such new
words as bicycle, coffee, and key; and variation between [k] and [g] for car.
Computer is imitated as [kug]. On the other hand, [g] is produced consistently
in continuing use of the proto-word <ga> (originally Great Gable) and in the
new (proto-)word golly-goo (as Timmy's mother dubbed it), which originated
as helicopter but is now used for Humpty-Dumpty, pictures of elephants and
squirrels, and elsewhere. A new, highly variable word is lizard, reported by the
mother as zazoo, but occurring in the session (in repeated reference to pictures
of a large caterpillar) as iterated <jai> or <wa> and, once only, [ja:du]
(zazoo).
We have followed Timmy's emergent phonology from his first pair of
undifferentiated <ba> words to a fairly well-developed system including extensive
feature geometry and autosegmental levels of representation with word,
syllable, and segmental units. We have seen how he gradually added first
consonant, then vowel features to his lexical representations. And we have
seen how the emergence of feature contrast, first in vowels, then in consonants,
was followed by a combinatorial explosion, reflecting the logic of the underlying
system very much as outlined in Lindblom (1992).
Table 9.2. Inventory of a child's syllables (Timmy)
Consonant at syllable onset: <b>, <t>, <k>, <>, <s>, <ɟ>, <m>, <n>, <j>, <w>
Age (in months): syllables
9: ba
10: ba
11: ba, ka
12: ba, ka
13: ba, ka
14: ba, ka, ja
15a: ba, ka, a, ɟa, na, ja
15b: ba, ka, a, ɟa, ma, na, ja
16a: ba, ta, ka, a, sa, ɟa, ma, na, ja; ti, ki
16b: ba, ta, ka, a, sa, ɟa, ma, na, ja, (wa); bi, ti, ki, mi, ni; ba, tu, ku, (dzu), mu
Note: a = earlier in month, b = later in month.
Alice: the palatal pattern
Alice's phonological development illustrates the emergence of a far more
complex initial structure. Alice appears to organize her phonology on two
independent planes at once. At the autosegmental level, she gradually works
her mastery of the motoric control needed to produce a palatal glide, [j], into a
word-based palatal melody. The melody may be seen to evolve gradually out of
the words which Alice selects or attempts to produce, words that naturally
accommodate her preferred phonetic gesture, [j] (e.g., hi, baby at 10 months). At
first the melody is applied inconsistently in production, to whole words (no
[nj]: 9 mos., bottle [bj]: 11 mos.), then to both words and syllables (dolly
[dali:], elephant [ni a]: 13 mos.). At each of these levels Alice explores a
variety of options. We identify the beginnings of a phonological system at 14
months, when a single relatively consistent word production pattern begins to
be applied to a range of different words. Some examples:
baby [be:bi] blanket [bi]
bottle [bad i] mommy [ma:i]
Bonnie [ban i]
The pattern found in these productions is related to the form of the adult models,
but cannot derive from them alone; it fits closely with baby and Bonnie, but
distorts blanket and mommy.
Figure 9.3 tracks over time the emergence and decline of the various elements
which participate in the formation of the system in evidence at 14 months. Three
patterns are isolated and identified in the order from most to least complex or
inclusive (beginning with the bottom-most panel): <CoVCi>, or polysyllabic
word shapes including a final [i] (e.g., baby, Bonnie, mommy); <Vi>, or monosyllabic
word shapes including a front rising diphthong (e.g., hi, Ais, the child's
nickname, which rhymes with haze); and <jV>, or any other word shapes that
include the glide yod ([j]: e.g., yumyum).
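The three pattern definitions above amount to an ordered decision procedure, which the following sketch makes explicit. It assumes that each word token is given as a list of syllables in a rough plain-letter transcription, with "j" for yod and a front rising diphthong written as a vowel followed by "i"; these conventions and the names `classify` and `pattern_shares` are illustrative, not the coding scheme actually used for Figure 9.3.

```python
from collections import Counter

VOWELS = set("aeiou")


def has_i_diphthong(syllable):
    """True if some vowel other than i is immediately followed by i."""
    return any(a in VOWELS and a != "i" and b == "i"
               for a, b in zip(syllable, syllable[1:]))


def classify(syllables):
    """Assign a token to the most inclusive palatal pattern it fits."""
    if len(syllables) > 1 and syllables[-1].endswith("i"):
        return "CoVCi"   # polysyllabic shape with final [i]
    if len(syllables) == 1 and has_i_diphthong(syllables[0]):
        return "Vi"      # monosyllable with a front rising diphthong
    if any("j" in s for s in syllables):
        return "jV"      # any other shape containing yod
    return None          # no palatal pattern


def pattern_shares(tokens):
    """Percentage of tokens falling under each pattern (None = no pattern)."""
    counts = Counter(classify(t) for t in tokens)
    return {pattern: 100 * n / len(tokens) for pattern, n in counts.items()}


print(classify(["be", "bi"]))    # CoVCi
print(classify(["hai"]))         # Vi
print(classify(["jam", "jam"]))  # jV
print(classify(["ba"]))          # None
```

Checking the patterns in this order reflects the ranking from most to least inclusive: a token is counted under <jV> only if it fits neither of the other two shapes.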
At 8 months only babble was produced; there were no identifiable words.
Babble vocalizations tended to include yod to an uncommonly great extent
(24 percent vs. a mean of 6 percent for nine other American infants; McCune
and Vihman 1987). At 9 months three words were identified, realized in seven
tokens. The <jV> pattern is incorporated into two of these tokens: hello/hi[ya]
[hije], no [n:j]. No other palatal pattern is used in words in this month.
At 10 months we see the first and strikingly high use of the <Vi> pattern in
words, accounting for 50 percent of all word tokens. Two likely adult sources of
this pattern for Alice are hi and baby, words she produces accurately, with
syllable count and nuclear-syllable consonant and vowel matching the adult
form. Both words constitute plausible models for the shaping of a palatal
articulatory gesture in the direction of adult speech.
Figure 9.4 displays the use of all palatal patterns combined in babble as compared
with words to facilitate tracing the emergence of a word schema out of the babble
repertoire. Here we see a sharp increase in word production at 10 months (to twenty
word tokens), with proportionate increase in palatal pattern use, while babbling itself
shows little change. Babbling shapes foreshadow word shapes, as we see in the first
wave of palatal patterning in babbling at 8–12 months, perhaps reflecting the
child's global auditory representation of words like mommy, daddy, baby, hi, and her
own nickname, Ais. Babbling also reflects newly emergent patterns first attempted
in word production (as illustrated in Elbers and Ton 1985), as we see in the second
wave of palatal patterning in babbling, at 13–16 months, covering the period in
which word production shows a dramatic palatal-pattern-based increase. Only the
emerging lexicon shows sharp or apparently categorical changes from month to
month, however, reflecting the ongoing phonological work of construction and
reorganization or systematization.
Figure 9.3. Palatal pattern use in words
(Three panels: <jV>, <Vi>, and <CoVCi>. x-axis: months of age, 8 to 16; y-axis: percent use of palatal pattern, 0 to 100.)
Looking over the changing patterns in Figure 9.3 in the remaining months, we
see that all three patterns are used in at least 10 percent of Alice’s word tokens at
11 months. From 12 months on (when word production drops temporarily), the
<jV> pattern is replaced by the other more differentiated patterns. Two patterns
compete at 13 months, the <CoVCi> pattern dominating from 14 months on.
Figure 9.4. Raw frequency of vocalizations and palatal pattern use
[Vertical axes: total babble vocalizations and total word tokens (0–100). Horizontal axis: months of age (8–16). Legend: palatalization in word tokens; word tokens; babbling; palatalization.]
Whereas one or at most two different palatal patterns had been used in earlier
months, at 14 months a full range of possibilities is explored, with some words
varying across subpatterns. For example, tokens of daddy vary between the fairly
accurate [tdi], a <Vi> form [tardi], and a more fully palatal [jrji]. Similarly, the
word hi, a staple of Alice’s lexicon for five months, now receives experimental
shaping into [ha:ji]. It is worth noting that at 14 months, when the majority of her
productions (42 tokens) are disyllabic [i]-final word shapes, almost all of these are
in relatively good conformity with the adult model; similarly, the diphthongal
production of words such as bye and eye (also kay and nigh-nigh at 15 months)
owes as much to the model and the child’s evident experience of a match as to
assimilatory or creative reconstruction by the child.
Until now Alice’s words seem to have been selected, at least in part, on the basis
of the increased salience of palatals. However, her palatalization pattern has
appeared to exist in some sense separate from its manifestation in any particular
word, because it is imposed inconsistently on various portions of different words,
even within the same recording session, and its effects vary from one token to the
next, even of the same word. Palatalization appears to have the status of an
autosegmental melody for Alice, independent of the segments on which it operates.
Furthermore, this palatal melody can be seen as a direct outgrowth of the phonetic
gesture [j] which marked Alice’s vocal production at 8–10 months.
We propose that the vocal motor scheme which is first manifest as [j] is
gradually shaped into the more extensive and flexible palatal melody expressed
in both mono- and disyllabic words from 14 months on. Until a range of
different words are produced in a phonologically consistent way, we have
vocal production under the dual influence of the infant’s own previous
vocalizations and prior experience of adult vocalizations, each embedded in a
familiar situation of use. Once a stable word production pattern is established,
we can infer the existence of a phonological system. This system is emergent at
14 months, as Alice experiments with different palatal patterns for old,
previously palatalized word shapes.
At 15 months Alice’s palatal pattern is no longer independent of the
underlying word shapes; it has begun to shape them. Words now begin to change in a
way that cannot be accounted for by the shape of the adult model. A number of
the relatively accurate earlier shapes have been replaced by disyllabic [i]-final
renditions that also incorporate yod, which had been submerged earlier in the
more abstract realization of palatal articulation affecting different parts of the
word: vowel nucleus ([Vi]), palatalized stop or nasal ([d ], [n ]), and final [i].
Now we see a resurgence of intervocalic yod in forms such as blanket [baji] and
dolly and daddy [daji]. The manifestation of palatalization has become system-
atic, reminiscent of the spread of tones in tone languages or of nasalization in a
language like Guarani. This is the sign of an active phonological system
exerting an influence on production patterns, where earlier those patterns
merely reflected various possible interactions between Alice’s well-developed
motoric capacity and her auditory experience of the adult language.
A clear picture of Alice’s phonological system now emerges. Palatalization is
redundant everywhere except in word onsets. Medial consonant features
nevertheless will be specified in most words, as they may emerge in any one of three
ways: intact, palatalized, or replaced by Alice’s default, [j]. First-syllable vowels
must be specified, because these are not predictable. However, each initial syllable
must include two morae, the first of which will be specified whereas the second
will be either filled by the default palatalization or left unrealized. Production
factors may determine which of the consonantal and vocalic options occurs as the
output form. Alice seems to allot motoric attention to the onset of a word pattern
and then, in the remainder, allow her default palatal to fill in wherever attention or
articulatory agility fails. In some cases (e.g., lady at 16 months) even the initial
consonant is lexically unrealized at times, and is therefore supplied with the
default [j]. Vowels in second syllables remain unspecified as they are redundantly
palatal, with one or two extra-systemic exceptions (e.g., one production of blanket
as [bo], hammer as [hv:a]).
These aspects of Alice’s system provide an interesting contrast to
Timmy’s. Her motor skills are more developed than his, as reflected in far
greater phonotactic and phonetic variety. She is willing to experiment (at 14
months) with various ways to integrate her palatalization pattern into her
phonological system. Like Timmy, she is able to underspecify some elements
in her lexical representation because she has a preexisting well-practiced
motor pattern which will fill in for them. However, she shows greater
variability in use of her pattern; production variables as well as lexical
organization influence the extent to which it is used for any given word
token. Whereas the concept of planar segregation can serve to simplify our
model of Timmy’s phonology, it cannot account for Alice’s system. The
primary reason it cannot is that our autosegmental representation of Alice’s
pattern must include branching within the syllable, with two morae available
for her frequent diphthongs and occasional CVC forms (e.g., clean [kin]). In
the absence of a simple CVCV phonotactic pattern, the relative order of
consonants and vowels is not predictable, and planar segregation is ruled
out. It is in any case unnecessary here, because the palatal melody affects
consonants and vowels alike.
Let us now consider the last month for which phonological data are available,
16 months. Alice’s polysyllables have begun to return to the balance reflected at
14 months; there are no new examples of regression affecting formerly
correct forms, although experimentation continues in words which are
difficult for the child, such as mommy, now sometimes produced with medial [m],
sometimes with [n. ], and lady, in which the initial lateral and the medial stop are
both subject to replacement by yod in variant tokens. The word that gives Alice
the most trouble is elephant. The toy set which engaged Alice at each recording
session included a Jack-in-the-box elephant. In a classic illustration of a child
valiantly attempting a situationally salient word with an alien phonological
shape, Alice progressed from [e:], [a] or [ni] at 13 months, to [anj],
[aij] or [ j i ] at 16 months, still far from the model, yet hardly random
productions. In fact, it seems clear that Alice is guided here, as elsewhere, by the
interacting effects of her emergent range of available word production routines
and her auditory impression or perceptual experience of the adult word, for the
exact nature of which we have no independent information.
There is yet another categorical advance in phonological organization to
be observed, however. Throughout the period studied the few forms that show
no palatal patterning typically are monosyllables; longer forms that fail to
match the full <CoVCi> pattern generally incorporate some individual ele-
ment of palatality, as in the elephant tokens noted above. At 16 months,
nonpalatal word shapes are exclusively monosyllables. However, these non-
palatal patterns now have begun to receive phonological attention, and two
new patterns can be discerned: (1) monosyllables with a low back vowel
nucleus and (2) monosyllables with nuclear [i]. For (1), models appear to
include down, man, (grand)pa, and up, all produced relatively accurately.
Other words are assimilated to this pattern: duck [t] and even milk [m:]
(perhaps assimilated to the back [a/] pattern instead of the expected [i]
pattern due to the influence of postvocalic velarized []). The second pattern
constitutes a new departure for Alice’s palatal gesture: the preceding
consonant typically is palatalized, and even the word shoe ([i]) is assimilated to it
(here it is the initial consonant that appears to dictate the choice of the
front-vowel pattern).
Thus, we are proposing partial lexical representations for Alice which may
include specifications at the word or syllable level. In addition, we propose that
unspecified vowel slots are redundantly filled in with a palatal feature
specification on output, and that the same palatal feature specification may variably
affect other portions of the word, depending in part on production variables.
In a sense, the groundwork for a syllable versus word-level phonological
distinction was already laid at 13 months, when Alice’s phonetic-level palatal
pattern, based on a prior vocal motor scheme, was first optionally applied to
either one or both syllables in disyllabic words, demonstrating emergent
control of units smaller than the word. At 14 months we see the first consistent
treatment of a range of different words; we date the internalization of a
schematic phonological pattern or system to this month. Before that, there is no
reason to claim that a phonological system is operational, because the only
evident influences on her productions remain the motoric preference for
palatalization and its apparent perceptual salience (based on word selection for
production). It is only at 14 months, when Alice begins restructuring words
from previously more adultlike shapes into forms that fit a consistent pattern of
her own, that these influences can no longer provide an adequate account of her
developing phonology. At this point, the convergence of preferred motoric
organization and specific auditory bias is represented in an internal system
independent of particular word forms; at 15 months this system has begun to
actively assimilate both old and new word forms.
Discussion
Now that we have considered Timmy and Alice’s development in detail, we can
return to Figure 9.1. We notice that the production patterns governing the
children’s early words are surprisingly similar (see also Molly in Vihman and
Velleman 1989: [kak: ] clock, glasses; [ n:i] Nicky). Simple CV(CV) word
shapes persist in both lexicons. Both children have discovered segments,
but have retained some word-level feature specifications. When Timmy’s
default labial occurs only once in a word, it is in medial position. Alice’s
consonants continue to be subject to palatalization at the segmental, syllabic,
or word level, as we see in her variable productions of lady and daddy. When her
phonetic default consonant, palatal yod, occurs only once in a word, however, it
too is medial.
We take the preferred or default consonant types to be overlearned motori-
cally, given their roots in babble and their overgeneralized use in words.
Although Timmy and Alice arrive at their word production patterns in different
ways, it is striking that, in both cases, the default consonant finds a slot in the
second syllable. This can be accounted for in formal terms by directional
association: lexically specified consonant features are first associated with the
initial (left-most) consonant slot, whereas the medial consonant is assigned
what is left, a default, or nothing. Phonetically speaking, the child can be seen
here to devote the most motoric attention to word-initial position, producing the
most challenging or less familiar elements in that position and relying on
more automatic production options (harmony, default, or omission) for medial
consonants (see Branigan 1976: “contrasts should occur first in the most
favorable environment for their unaffected production . . . Consonants in initial
position . . . receive the first neural commands and therefore [should] be least
influenced by preceding positions of the articulators” [p. 129]). The production
pattern described in Vihman and Velleman (1989) also fits this account, as does
the [l]-default pattern described for a French child in Vihman (1993b).
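The directional association just described can be sketched in a few lines of Python. This is an illustrative sketch only, not the formalism used in the chapter: the function name associate_left_to_right, the list-of-slots representation, and the second and third example inputs are assumptions introduced here, and the harmony option is not modeled.

```python
from typing import List, Optional


def associate_left_to_right(
    specified: List[str],
    n_slots: int,
    default: Optional[str] = None,
) -> List[Optional[str]]:
    """Link lexically specified consonants to consonant slots from the left edge.

    Hypothetical helper: specified consonants are associated in order,
    beginning with the word-initial slot. A slot that receives no specified
    consonant is given the child's default consonant, or None (omission)
    when no default is supplied.
    """
    slots: List[Optional[str]] = []
    for i in range(n_slots):
        if i < len(specified):
            slots.append(specified[i])  # lexically specified material first
        else:
            slots.append(default)       # default consonant or omission
    return slots


# Alice's daddy [daji]: only the initial consonant is specified,
# so the medial slot is filled by her default yod.
daddy = associate_left_to_right(["d"], n_slots=2, default="j")

# Assumed example with no default available: the medial slot stays empty.
omitted = associate_left_to_right(["k"], n_slots=2)

# Assumed example with two specified consonants: the medial slot takes
# "what is left" and the default is not needed.
both = associate_left_to_right(["m", "n"], n_slots=2, default="j")
```

On this sketch the default consonant can surface only in a non-initial slot whenever at least one consonant is lexically specified, which is the positional asymmetry observed for both children.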
Timmy and Alice enter into adultlike vocal production with different pho-
netic resources. The adult words that appear to be most salient to them, based on
the relative accuracy and frequency of their early production attempts, are
correspondingly different. What they share in the early months of word pro-
duction is the apparent reliance, for imitation or spontaneous word production
in restricted contexts, on a small number of well-practiced (preferred)
phonetic patterns or vocal motor schemes, identifiable in context as modeled on
similarly patterned, apparently preselected or matching adult words.
Following the first few months of word production (six months for Timmy,
five for Alice), there is a sharp change in production (best seen in Table 9.2 for
Timmy, at 15 months, and in Figure 9.4 for Alice, at 14 months). Now a
generalized word production pattern has emerged, and a wider range of word
shapes begins to be attempted, expanding the phonological repertoire
(accommodation, in Piaget’s terms) but also adapting adult models to fit the child’s
pattern (assimilation). At this point, it is hard to deny the psychological reality
of a shift to a rudimentary phonological system. This system is built around the
preexisting vocal motor schemes and the adult models by which those early
vocal patterns were shaped, and is thus continuous with prelinguistic vocal
production. However, the system has a logic and a dynamic of its own. Once the
system has evolved, integrating the child’s vocal resources and activating a
particular pattern of interconnections among potential phonetic gestures, new
words are more readily admitted to the child’s active lexicon, and production
accuracy decreases while boldness or departures from a set pattern increase
(see Lewis 1936, fig. 1; Leonard, Schwartz, Folger & Wilcox 1978; Schwartz
and Leonard 1982).
Conclusion
We have suggested that the advent of phonological systematicity is rooted in
cognitive advances. Increases in representational capacity have been found to
be identifiable in the nonverbal domain through the more flexible use of
symbols in combinatorial play (McCune 1995). This results in a more
differentiated experience of the situation, of the events and objects of interest
and the accompanying vocal forms, and also in greater processing space with
which the child may compare and contrast his or her own vocal patterns and
those of adults. The change in representational capacity affects the use of words,
which now refer to a range of instances or tokens for a single name or word
type, including relational words whose appropriate use presupposes awareness
of alternative potential states (e.g., all gone, more imply a mental comparison of
presence vs. absence: McCune-Nicolich 1981a). The relatively rapid increase in
vocabulary often observed at this point in the child’s development may reflect
new understanding of the function of language, but also may result from a new
capacity for phonological internal representation, which simplifies word
production by creating a small set of routines to be followed, requiring perceptual
and motoric attention only to selected aspects of the target word.
The connectionist model of cognitive functioning recently has been invoked
in a number of studies as a promising way to model children’s phonological
production (Berg 1992; Menn and Matthei 1992; Menn, Markey, Mozer and
Lewis 1993; Stemberger 1992). We view such models as particularly apt for the
early period we have described here, in which the child’s vocal exploration
through babbling results in the laying down of preferred neural pathways
(with motor connections activating auditory connections as a result of
self-monitoring), thereby setting connection strengths that will influence phonetic
patterning at least for the first several months of word production. Connectionist
models also are helpful in conceptualizing or accounting for the interaction
between extra-systemic and systemic elements and the high variability
associated with production shortly before a new pattern becomes established as part of
the system (see Thelen 1989 and Vihman and Velleman 1989; Figures 9.1–9.4).
We argue, however, that something changes at the point we have
identified as the beginnings of a phonological system. Underlying the change is
cognitive advance. The effect on language is a qualitative, categorical change
in function (generalized or referential word use) as well as form (the emergence
of a generalized word production pattern). Whereas the combined effects of
vocal motor scheme and auditorily salient patterns were sufficient to account for
the individual vocal shapes of early, context-limited words, an internal
representation, minimally modeled with a subset of the phonological structures
characteristic of the adult lexicon (including autosegmental levels, consonant
and vowel contrasts, and feature specifications), must be invoked to account for the
regularities found in the word production patterns characteristic of the more
advanced stage of context-flexible word use. Once a rudimentary phonological
system has begun to cohere, systemic pressure takes its place alongside
production capacity (range of vocal motor schemes) and the influence of the adult
model (auditory salience) as a primary factor shaping not only vocal production
but all subsequent phonological development.
Notes
1. For a discussion of why faces are special to infants, and a possible connection
between early enactive imitation and social identity, see Meltzoff and Moore
(1993).
2. Labials were nearly 10 percent more common in early words than in contempora-
neous babble in the four languages investigated by Boysson-Bardies and Vihman
(1991).
3. On the other hand, Goad (1992) argues on the basis of feature geometry that the
segment must be a primitive.
4. The notation < > will be used in referring to the word shapes used by this child, to
cover a fairly wide range of phonetic variants. The initial stop, whether labial or
dorsal, is produced at first with the full continuum of voice onset time possibilities,
from fully voiced to voiceless aspirated. The nuclear vowel may also be voiceless,
although only after voiceless onset consonant. The initial syllable may be preceded by
a short support or onset syllable, typically a low vowel or schwa; the low vowels
range from front to back.
5. The asymmetry in Timmy’s selection, resulting in a first lexicon of /b/- and /k/-words,
fits within the reported universals governing stop systems: where one or two gaps are
found in a voiced and voiceless series, it is voiceless /p/ and/or voiced /g/ that are most
likely to be missing (Gamkrelidze 1975).
6. The adult models for the most frequently occurring <ka(ka)> words provide both one-
and two-syllable target forms (cat/kitty and quack[-quack]). Monosyllabic car
seldom elicits a disyllabic response, whereas most tokens of baa(baa) are iterated as
<baba> when the word is first produced at 15 months.
7. The syllable [a] was an occasional phonetic variant for <ba> from 11 months on;
now it is drawn on for a newly emergent contrast and takes its place in the lexical/
phonological system.
8. There are reasonable phonetic arguments both for and against positing different
feature representations for consonants and vowels. Timmy’s data do not provide
strong phonological evidence either way, whereas Alice’s data show the palatalizing
influence of vowels on consonants as well as some apparent fronting and backing
influence on vowel choice from neighboring consonants (see Alice: the palatal
pattern, below).
9. The intention of maintaining contrast seems clear, given the consistent effortful extra
stress on the [i] of <babi> bottle (all four tokens) as well as on the final vowel of all 27
uses of <babu> bubble (produced as the child attempts to catch the bubbles his mother
is blowing), phonetically a front rounded [y] in this form only.
References
Aslin, R. (1993). Segmentation of fluent speech into words: learning models and the role
of maternal input. In B. de Boysson-Bardies, S. de Schonen, P. Jusczyk,
P. MacNeilage, and J. Morton (eds.), Changes in speech and face processing in
infancy: a glimpse at developmental mechanisms of cognition, pp. 305–15.
Dordrecht: Kluwer.
Baillargeon, R. (1987). Object permanence in 3 and 4 month old infants.
Developmental Psychology, 23, 655–64.
Bates, E., Benigni, L., Bretherton, I., Camaioni, L., and Volterra, V. (1979). The
emergence of symbols. New York: Academic Press.
Berg, T. (1992). Phonological harmony as a processing problem. Journal of Child
Bloom, L. (1991). Representation and expression. In N. Krasnegor, D. Rumbaugh, R.
Schiefelbusch, and M. Studdert-Kennedy (eds.), Biological and behavioral
determinants of language development, pp. 117–40. Hillsdale, NJ: Lawrence Erlbaum.
investigation of vowel formants in babbling. Journal of Child Language, 16, 1–17.
Branigan, G. (1976). Syllabic structure and the acquisition of consonants: the great
conspiracy in word formation. Journal of Psycholinguistic Research, 5, 117–33.
Chiat, S. (1979). The role of the word in phonological development. Linguistics, 17,
591–610.
Davis, B. L. and MacNeilage, P. F. (1990). Acquisition of correct vowel production: a
quantitative case study. Journal of Speech and Hearing Research, 33, 16–27.
Elbers, L. and Ton, J. (1985). Play pen monologues: the interplay of words and babbles in
the first words period. Journal of Child Language, 12, 551–65.
Fee, E. J. (1991). Prosodic morphology in first language acquisition. Paper presented at
the Boston University Conference on Language Development, Boston, MA,
October.
Ferguson, C. A. (1964). Baby talk in six languages. American Anthropologist, 66 (6, Part
2), 103–14.
(1978). Learning to pronounce: The earliest stages of phonological development in
the child. In F. D. Minifie and L. L. Lloyd (eds.), Communicative and cognitive
abilities – early behavioral assessment, pp. 273–97. Baltimore: University Park
Press.
ition. Language, 51, 419–39. Reprinted in this volume as chapter 4.
Fernald, A. (1984). The perceptual and affective salience of mothers’ speech to infants.
In L. Feagans, C. Garvey, and R. Golinkoff (eds.), The origins and growth of
communication, pp. 5–29. Norwood, NJ: Ablex.
Fernald, A. (1991). Prosody in speech to children: prelinguistic and linguistic functions. In
R. Vasta (ed.), Annals of child development, vol. 8, pp. 43–80. London: Jessica Kingsley.
Fikkert, P. (1991). Well-formedness conditions in child phonology: a look at metathesis.
Paper presented at Crossing Boundaries: Formal and Functional Determinants of
Language Acquisition, Tübingen, Germany, October.
Gamkrelidze, T. V. (1975). On the correlation of stops and fricatives in a phonological
system. Lingua, 35, 231–61.
Garnica, O. K. (1977). Some prosodic and paralinguistic features of speech to young
children. In C. E. Snow & C. A. Ferguson (eds.), Talking to children: language
input and acquisition. Cambridge University Press.
Goad, H. (1992). Learnability and inventory specific underspecification. Paper presented
at the meeting of the Linguistic Society of America, Philadelphia, January.
Goldfield, B. (1993). Noun bias in maternal speech to one-year-olds. Journal of Child
Language, 20, 85–99.
Goldsmith, J. A. (1979). The aims of autosegmental phonology. In D. A. Dinnsen (ed.),
Current approaches to phonological theory, pp. 202–22. Bloomington: Indiana
University Press.
(1990). Autosegmental and metrical phonology. Oxford: Blackwell.
Iverson, G. and Wheeler, D. (1987). Hierarchical structures in child phonology. Lingua,
73, 243–57.
Kent, R. D. (1992). The biology of phonological development. In C. A. Ferguson,
L. Menn, and C. Stoel-Gammon (eds.), Phonological development: models,
research, implications, pp. 65–90. Parkton, MD: York Press.
Kent, R. D. and Bauer, H. R. (1985). Vocalizations of one year olds. Journal of Child
Language, 12, 491–526.
Kiparsky, P. and Menn, L. (1977). On the acquisition of phonology. In J. Macnamara
(ed.), Language learning and thought, pp. 47–78. New York: Academic Press.
Leslie, A. M. (1987). Pretense and representation: the origins of “theory of mind.”
Psychological Review, 94, 412–26.
Leonard, L. B., Schwartz, R. G., Folger, M. K., and Wilcox, M. J. (1978). Some aspects
of child phonology in imitative and spontaneous speech. Journal of Child
Language, 5, 403–15.
Lewis, M. M. (1936). Infant speech: a study of the beginning of language. New York:
Harcourt Brace.
Lindblom, B. (1992). Phonological units as adaptive emergents of lexical development.
In C. A. Ferguson, L. Menn, and C. Stoel-Gammon (eds.), Phonological
development: models, research, implications, pp. 131–63. Parkton, MD: York Press.
Lleó, C. (1992). A parametrical view of harmony and reduplication processes in child
phonology. Unpublished MS.
(1988). The sound shape of early lexical representations. In M. D. Smith and
J. L. Locke (eds.), The emergent lexicon, pp. 3–22. New York: Academic Press.
Parkton, MD: York Press.
(1993). Developmental changes in the acquisition of phonology. In B. de Boysson-
Bardies, S. de Schonen, P. Jusczyk, P. MacNeilage, and J. Morton (eds.), Changes
in speech and face processing in infancy: a glimpse at developmental mechanisms
of cognition, pp. 435–49. Dordrecht: Kluwer.
Macken, M. A. and Ferguson, C. A. (1983). Cognitive aspects of phonological
development: model, evidence and issues. In K. E. Nelson (ed.), Children’s language, vol.
4, pp. 256–82. Hillsdale, NJ: Lawrence Erlbaum.
MacNeilage, P. F. and Davis, B. L. (1990). Acquisition of speech production: Frames,
then content. In M. Jeannerod (ed.), Attention and performance XIII: motor
representation and control, pp. 453–76. Hillsdale, NJ: Lawrence Erlbaum.
McCarthy, J. (1989). Linear order in phonological representation. Linguistic Inquiry, 20,
71–99.
McCune, L. (1992). First words. In C. A. Ferguson, L. Menn, and C. Stoel-Gammon
McCune, L. (1995). A normative study of representational play at the transition to
language. Developmental Psychology, 31, 198–206.
McCune, L. and Vihman, M. M. (1987). Vocal motor schemes. Papers and Reports on
McCune-Nicolich, L. (1977). Beyond sensorimotor intelligence: assessment of
symbolic maturity through analysis of pretend play. Merrill-Palmer Quarterly,
23, 89–101.
(1981a). The cognitive bases of relational words in the single word period. Journal of
(1981b). Toward symbolic functioning. Child Development, 52, 785–97.
McCune-Nicolich, L. and Bruskin, C. (1981). Combinatorial competency in symbolic
play and language. In K. Rubin (ed.), The play of children: current theory and
research, pp. 5–22. Basel: Karger.
McDonough, J. and Myers, S. (1991). Consonant harmony and planar segregation in
child language. Unpublished manuscript, UCLA and University of Texas at
Austin.
Meltzoff, A. and Moore, M. K. (1993). Why faces are special to infants – on connecting
the attraction of faces and infants’ ability for imitation and cross-modal processing.
In B. de Boysson-Bardies, S. de Schonen, P. Jusczyk, P. MacNeilage, and J. Morton
(eds.), Changes in speech and face processing in infancy: a glimpse at
developmental mechanisms of cognition, pp. 211–25. Dordrecht: Kluwer.
Menn, L. (1978). Phonological units in beginning speech. In A. Bell and J. B. Hooper
(eds.), Syllables and segments, pp. 157–71. Amsterdam: North-Holland.
Menn, L., Markey, K., Mozer, M., and Lewis, C. (1993). Connectionist modeling and the
microstructure of phonological development: a progress report. In B. de Boysson-
Bardies, S. de Schonen, P. Jusczyk, P. MacNeilage, and J. Morton (eds.), Changes
in speech and face processing in infancy: a glimpse at developmental mechanisms
of cognition, pp. 421–33. Dordrecht: Kluwer.
Mulford, R. (1988). First words of the blind child. In M. D. Smith & J. L. Locke (eds.),
The emergent lexicon: the child’s development of a linguistic vocabulary,
pp. 293–338. New York: Academic Press.
Ohala, D. (1991). A unified theory of final consonant deletion in early child speech.
Unpublished MS, University of Arizona.
Piaget, J. (1962). Play, dreams and imitation in childhood. New York: W. W. Norton.
Ramsay, D. and Campos, J. (1978). The onset of representation and entry into stage 6 of
object permanence development. Developmental Psychology, 52, 785–97.
Roug, L., Landberg, I., and Lundberg, L.-J. (1989). Phonetic development in early
infancy: a study of four Swedish children during the first eighteen months of life.
Rovee-Collier, C., Sullivan, M. W., Enright, M., Lucas, D., and Fagen, J. W. (1980).
Reactivation of infant memory. Science, 208, 1159–62.
Sagey, E. (1986). The representation of features and relations in nonlinear phonology.
Unpublished PhD dissertation, Massachusetts Institute of Technology, Cambridge,
MA.
Sartre, J.-P. (1966). The psychology of imagination, trans. B. Frechtman. New York:
Washington Square Press. (Original work published in 1948.)
Schwartz, R. G. and Leonard, L. B. (1982). Do children pick and choose? An
examination of phonological selection and avoidance in early lexical acquisition. Journal
of Child Language, 9, 319–36.
Spencer, A. (1986). Towards a theory of phonological development. Lingua, 68, 3–38.
Stemberger, J. P. (1992). A connectionist view of child phonology: phonological
processing without phonological processes. In C. A. Ferguson, L. Menn, and C.
Stoel-Gammon (eds.), Phonological development: Models, research, implications,
pp. 165–89. Parkton, MD: York Press.
Stemberger, J. P. and Stoel-Gammon, C. (1991). The underspecification of coronals:
evidence from language acquisition and performance errors. In C. Paradis and J.-F.
Prunet (eds.), Phonetics and phonology, Vol. 3: The special status of coronals,
Stoel-Gammon, C. and Otomo, K. (1986). Babbling development of hearing impaired
and normally hearing subjects. Journal of Speech and Hearing Disorders, 51,
33–41.
Thelen, E. (1989). Self-organization in developmental processes: can systems
approaches work? In M. R. Gunnar and E. Thelen (eds.), Systems and development:
the Minnesota Symposia on Child Psychology, vol. 22, pp. 77–117. Hillsdale, NJ:
Lawrence Erlbaum.
Uzgiris, I. and Hunt, J. (1975). Assessment in infancy: ordinal scales of psychological
development. Champaign: University of Illinois Press.
Van Gulick, R. (1982). Mental representation: a functionalist view. Pacific Philosophical
Quarterly, 63, 3–20.
Velleman, S. L. (1992). A nonlinear model of early harmony and metathesis. Paper
presented at the meeting of the Linguistic Society of America. Philadelphia, PA,
January.
Vihman, M. M. (1976). From pre-speech to speech: on early phonology. Papers and
(1992). Early syllables and the construction of phonology. In C. A. Ferguson,
L. Menn, and C. Stoel-Gammon (eds.), Phonological development: models,
research, implications, pp. 393–422. Parkton, MD: York Press.
(1993a). The construction of a phonological system. In B. de Boysson-Bardies, S. de
Schonen, P. Jusczyk, P. MacNeilage, and J. Morton (eds.), Changes in speech and
face processing in infancy: a glimpse at developmental mechanisms of cognition,
pp. 411–19. Dordrecht: Kluwer.
(1993b). Variable paths to early word production. Journal of Phonetics, 21, 61–82.
Vihman, M. M. and McCune, L. (1994). When is a word a word? Journal of Child
Vihman, M. M., Macken, M. A., Miller, R., Simmons, H., and Miller, J. (1985). From
babbling to speech: a re-assessment of the continuity issue. Language, 61, 397–445.
Vihman, M. M. and Miller, R. (1988). Words and babble at the threshold of lexical
acquisition. In M. D. Smith and J. L. Locke (eds.), The emergent lexicon,
(1981). A tentative developmental model of phonological representation. In T. Myers,
J. Laver, and J. Anderson (eds.), The cognitive representation of speech,
pp. 323-33. Amsterdam: North-Holland.
Werner, H. and Kaplan, B. (1963). Symbol formation. New York: John Wiley.
(Reprinted, 1984. Hillsdale, NJ: Lawrence Erlbaum.)
10 Beyond early words: word template
development in Brazilian Portuguese
Daniela Oliveira-Guimarães
Introduction
As first observed by Ferguson and Farwell (1975), the very first words a child
produces are relatively accurate. It is as if children, in targeting or choosing
words, take into account their own articulatory limitations (see also Ferguson,
Peizer, and Weeks 1973). As the vocabulary expands, word forms become less
accurate and more similar to one another (Vihman 1996; Vihman and Kunnari
2006), with the emergence of phonological patterning, or word templates, in
later words. At this point, according to Vihman and Velleman (2000),
systematicity can be identified, reflecting the construction of a first phonology.
Phonological development, as measured by gains in accuracy, is therefore
nonlinear: as in other areas of development, children show regression, variation
and periods of instability (Thelen and Smith 1994).
Several studies of phonological development have focused on babbling,
the transition from babbling to first words, and the distinction between early
and later words in the first word period, with a focus on the emergence of
templates in the latter (e.g., Stoel-Gammon and Cooper 1984; Vihman 1993;
Vihman, Velleman, and McCune 1994; Vihman and Velleman 2000; Keren-
Portnoy, Majorano, and Vihman 2008). However, few studies have traced
changes in the way that templates are expressed over time, or followed their
decline and disappearance (Macken 1979 and Vihman and Vihman 2011 are
two such studies; see also Priestly 1977). The goal of this study is to analyze
the emergence and evolution of word templates through two case studies of
children acquiring Brazilian Portuguese and to discuss the role of the word
and the segment in phonological acquisition. According to Ferguson and
Farwell (1975), Menn (1983), and Vihman and Croft (2007), the first unit
of phonological organization corresponds to the word. In her detailed longitudinal
analysis of one child's early phonology, Macken (1979) argues that
templates capture important facts about early development but that in
later stages the segment replaces the word as the basic structural unit.
Similarly, Vihman and Vihman (2011) trace the rise and decline of templates
in the word forms of a bilingual child. We provide further evidence here of
the gradual fading of templates as the segment emerges as an important unit of
representation.
In this chapter we evaluate the emergence and extension of phonological
templates to new words as the child's vocabulary increases. We address the
following questions: (1) How do templates emerge and then gradually fade over
time? (2) Do templates themselves change over time? (3) What is the role of
templates in the course of phonological development? (4) How and when do
children advance from whole-word to segment-based phonology? (5) Do seg-
ments replace the word as a unit when templates disappear, or does the word
remain a structural unit in the phonological grammar even in later stages?
In the next section, we provide a brief description of Brazilian Portuguese
phonology. We then give an overview of previous studies of Brazilian
Portuguese phonological development. In the remaining sections we analyze
the phonological development of each of two boys over the course of one year.
As we will see, Lucas's data provide some interesting evidence for the way that
templates emerge and change over time. On the other hand, Paulo's data make it
possible to observe the transition from a holistic to a more detailed
representation.
Brazilian Portuguese phonetics and phonology
The Brazilian Portuguese phonetic inventory includes 29 consonants (including
the allophones [tʃ] and [dʒ]), two offglides, 7 full oral vowels (plus three reduced
vowels) and 5 nasal vowels, according to Cristófaro-Silva (2001; see
Table 10.1, below). The affricates /tʃ/ and /dʒ/ occur before the vowel /i/ (tia
/tʃia/, dia /dʒia/). These sounds are in complementary distribution with the
alveolar stop consonants /t/ and /d/. Affricate occurrence is an important
dialectal marker in Brazilian Portuguese (Cristófaro-Silva 2001) and is fully
realized in Belo Horizonte. There is also a variable group of r sounds. In initial
position, the r can be pronounced as /h/ or /x/. In syllable-final position, the r
agrees in voicing with the following consonant: porta 'door' /pɔhta/ or /pɔxta/;
carga 'load' [kaɦga] or [kaɣga]. The alternation between [x, ɣ] and [h, ɦ] is dialect
dependent, with the latter pair of variants generally occurring in Belo Horizonte.
Furthermore, a retroflex or a tap can, in some dialects, be pronounced in
syllable-final position, as for example porta 'door' /pɔɻta/ or /pɔɾta/.

Table 10.1. Phonetic inventory of Brazilian Portuguese

Consonants (C)
  Plosive                 p b, t d, k g
  Nasal                   m, n, ɲ
  Fricative               f v, s, z, ʃ, ʒ, x, ɣ, h, ɦ
  Affricate               tʃ, dʒ
  Tap                     ɾ
  Laterals                l, , l
  Retroflex approximant   ɻ
  Glides                  w, j

Vowels (V)
  i   u
  e   o
  ɛ   ɔ
  a
The permitted syllable structure is (C)(C)V(C)(C). The syllabic nucleus
can be occupied by any vowel. The second consonant in a cluster can be
a lateral [l] or a tap [ɾ], as for example prato 'dish' [pɾatʊ] or flor 'flower'
[floh]. In postvocalic position, permitted consonants are limited to the
following:
(1) a lateral /l/, or a glide /w/, as in balde 'bucket', variably pronounced as
/baɫdʒi/ ~ /baldʒi/ ~ /bawdʒi/, the last form being the most common in the
Belo Horizonte dialect, which the subjects of this study are acquiring;
(2) a voiced-voiceless pair of velar fricatives /x, ɣ/ and glottal fricatives /h, ɦ/, a
tap /ɾ/ or a retroflex /ɻ/, as in parte /pahtʃɪ/ ~ /paxtʃɪ/ ~ /paɾtʃɪ/ ~ /paɻtʃɪ/;
(3) a sibilant /s/ or /z/ and in some dialects the palatal sibilants /ʃ/ and /ʒ/ (Bisol
2005), as in paz 'peace' /pas/ ~ /paʃ/. The voicing of the sibilant assimilates
to that of the following consonant (e.g., pasta 'briefcase' /pasta/ and rasga
'it rips' /hazga/). The alternation between /s, z/ and /ʃ, ʒ/ in syllable-final
position is dialect dependent.
Thus, the coda in Brazilian Portuguese is quite variable and dependent on the
particular dialect. Besides these consonants, the (off-)glides /w/ and /j/, as in
mau 'bad' [maw] and pai 'father' [paj], can occupy the postvocalic position.
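As a rough illustration, the syllable template (C)(C)V(C)(C) given above can be written as a pattern over consonant/vowel skeletons. The sketch below rests on simplifying assumptions that are not part of the chapter: it works on strings made of the letters C and V only, it codes postvocalic glides as C, and it ignores the restrictions on which consonants may fill each slot. The function name is hypothetical.

```python
import re

# (C)(C)V(C)(C): up to two onset consonants, one vowel nucleus,
# up to two postvocalic consonants.
# Assumption: the input is a CV skeleton, with offglides coded as C.
SYLLABLE_SHAPE = re.compile(r"C{0,2}VC{0,2}")


def is_permitted_syllable_shape(skeleton: str) -> bool:
    """Return True if the skeleton matches the (C)(C)V(C)(C) template."""
    return SYLLABLE_SHAPE.fullmatch(skeleton) is not None


# Print the verdict for a few candidate shapes.
for shape in ["V", "CV", "CCV", "CVC", "CCVCC", "CCCV", "CC"]:
    print(shape, is_permitted_syllable_shape(shape))
```

A fuller check would also need the segmental restrictions listed in (1) to (3), which this sketch deliberately leaves out.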
In stressed position any vowel (oral or nasal) can occur, as shown in
Table 10.1. In nonstressed, word-final position, there is a reduction in the
vowel system, such that only the vowels /i/, //, and // occur, as in pato 'duck'
/pat/, abre 'open' /ab/, and casa 'house' /kaz/. In pretonic position we find
variation between the vowels /ɛ, e, i/ and /o, ɔ, u/, as in menino 'boy': /mɛninʊ/ ~
/meninʊ/ ~ /mininʊ/. This variation is dialect-specific and lexically conditioned
(Oliveira 1995; Viegas 2001).
In most Brazilian Portuguese words, the stress falls on the penultimate
syllable. However, there is a preference for stress on the final syllable when it
is closed (Bisol 2005). Stress on the antepenultimate syllable is the least
common.
A postvocalic nasal consonant occurs as part of a child phonological template
analyzed in this chapter. Thus, we will describe the nasal consonant and the
nasal vowels in Brazilian Portuguese in a little more detail. Three nasal
consonants, /n/, /m/, and /ɲ/, occur in onset position. Palatal /ɲ/ is restricted to that
position and is found only in loanwords. According to Mattoso-Camara (1970),
there are two types of nasal vowels in Brazilian Portuguese. The first type is
what Mattoso-Camara calls 'pure nasalization'. This is the same as the nasal
vowel in the French word bon /bɔ̃/, which contrasts with an oral vowel in beau
/bo/; for example, lança 'he/she throws' /ls/ contrasts with laça 'he/she
binds' /las/. In lança the nasal consonant which initially conditioned the
nasalization does not occur; the word form */lnsa/ is not found (and is not
permissible). The second type of nasalization is conditioned by contact between
a vowel and a nasal consonant in the following syllable, for example, lama
'mud' /lãma/. Although more abstract phonemic analysis postulates a nasal
segment in the first-syllable coda of cases like lança (Mattoso-Camara 1970),
no nasal consonant can be perceived in such cases. According to Medeiros and
Demolin (2006), acoustic analysis shows a transitional nasal murmur between
the nasal vowel and the following consonant, which reflects properties of the
postulated nasal consonant.
Previous studies of the acquisition of Brazilian Portuguese
Brazilian Portuguese phonological studies of children have so far mainly
focused on the order of acquisition of segments and syllables. Lamprecht
et al. (2004) report the results of several studies carried out using two corpora
of Brazilian Portuguese child speech collected in the south of Brazil and
including some 400 children, aged 1 to 7 years. The findings reported by
Lamprecht et al. are in line with studies of other languages in showing early
acquisition of the vowels /a/, /i/ and /u/, which occupy extreme positions in the
vowel triangle (Rangel 2002), and of plosives and nasals before other conso-
nants. Labials and alveolars are acquired before the velar plosives and palatal
nasal. Similarly, the labial fricatives are acquired before the coronals. Coda uses
of /h, ɦ, x, ɣ/ emerge first in word-final position and later in word-medial
position. Finally, the liquids, the most challenging sound class for children in
Brazilian Portuguese, are the last to be acquired, with the lateral /l/ being acquired
before the tap /ɾ/.
Miranda (2007) studied the acquisition of clusters, specifically, clusters of
plosive followed by a tap. She focused mainly on the relationship between
phonological variation and cluster acquisition. In her two studies, one cross-
sectional, the other longitudinal, she found that the CCV syllable was acquired
between the ages of 3;0 and 5;2. She also observed that the acquisition of
clusters is phonetically gradual and that both lexical item and token frequency
play an important role in this process.
Brazilian Portuguese acquisition research has focused on cross-sectional
studies of typically developing children, with the main goal of establishing a
developmental profile, a useful basis from which to identify possible delay.
Most of the studies evaluate the production of children over 2;0. However,
Teixeira and Davis (2002) carried out a study of early sound patterns in
Brazilian Portuguese based on a diary study of two typically developing
children, between the ages of 12 and 36 months. They found that coronal was
the most strongly represented place, followed by labial. The mid vowels as well
as front vowels were highly frequent. They observed that CV is the most
frequent syllable type in children's first words in Brazilian Portuguese and
reduplication occurs with high frequency. No study to date has investigated
template use in Brazilian Portuguese.
Methods
Participants
The participants in this double case study are part of a longitudinal
investigation of the acquisition of affricates in Brazilian Portuguese in four
typically developing children (Oliveira-Guimarães 2008). The children
all have monolingual exposure to Brazilian Portuguese in Belo Horizonte, a
large city in the state of Minas Gerais. The parents are also native to Belo
Horizonte and have university degrees. The children were first recorded
when the mother reported a survey vocabulary of 20 to 25 words, which can
be expected to correspond to production of no more than 10 words within
a half-hour recorded session (Vihman and Miller 1988). This study focuses
on two boys, Lucas and Paulo, aged 1;9 and 1;11, respectively, at the
outset.
Table 10.2 presents an overview of Lucas's and Paulo's vocabulary
development over the twelve months of the study.
Although the two mothers reported about the same number of words in each
child's vocabulary before the first recording session, there is a consistent
difference between numbers of word types and MLU over the twelve sessions,
reflecting the individual developmental paths of the two children. Paulo's
lexical and syntactic development is much more rapid than that of Lucas,
and at the end of the recording period his lexicon is almost twice the size of
that of Lucas and his MLU much larger.
Table 10.2. Overview of Lucas's and Paulo's vocabulary development
Square brackets indicate total combinations in each session.

          Lucas                                       Paulo
Session   Age       Types   Tokens   MLU in           Age        Types   Tokens   MLU in
                                     combinations                                 combinations
1         1;9.21    8       15       -                1;11.13    20      88       -
2         1;10.31   17      42       -                2;0.20     32      72       1.06 [2]
3         2;0.2     26      75       -                2;1.28     69      190      1.5 [17]
4         2;1.2     43      109      1.2 [6]          2;2.20     54      169      1.4 [14]
5         2;1.27    72      115      1.3 [8]          2;3.22     89      178      1.3 [17]
6         2;2.26    78      127      1.4 [7]          2;4.21     153     176      2.01 [67]
7         2;3.29    76      122      1.3 [8]          2;5.20     137     200      1.9 [50]
8         2;4.26    101     147      1.4 [24]         2;6.19     170     193      2.28 [78]
9         2;5.27    108     171      1.6 [40]         2;7.20     161     180      2.21 [90]
10        2;6.28    95      131      1.4 [20]         2;8.18     209     254      2.35 [127]
11        2;7.26    119     151      1.7 [36]         2;9.22     226     299      2.76 [136]
12        2;8.25    119     153      1.8 [39]         2;10.20    229     243      2.93 [147]
Data collection and transcription
Participants were audio- and video-recorded monthly, for half-hour free-play
caregiver-child interactions (usually mother-child, but the mother was sometimes
replaced by a regular babysitter) and researcher-child interactions. The recordings
took place in the children's homes. The children played either with their own toys
or with toys provided by the researcher. The equipment used in the recordings was
a digital tape recorder (Sony TCD D8), with a microphone attached to the child's
shirt, and a digital video camera (Sony Digital 8 DCR-TRV110).
All words identified were transcribed and submitted to acoustic analysis
using Praat software (www.praat.org). Utterances interrupted by noise (including
overlapping speech) or not easily audible were not transcribed. Acoustic
analysis was used to verify the transcriber's perception. Spectrograms were
made of all productions. The focus of the data analysis is on the identification of
individual child patterns or word templates.
In the analysis we will identify a template when the child's words begin to
resemble each other more than is expected, given the target forms attempted.
Words attributed to templates are categorized as either 'selected' or 'adapted'
(Vihman and Velleman 2000). 'Selected' refers to words that are a relatively good
match to their adult target and at the same time fit the child's template. 'Adapted'
refers to less accurate words that have been modified to fit into the child's template.
Results
Case study 1: Lucas's phonological development
In session 1, Lucas's words are generally quite similar to the target, as we see in
Table 10.3. All but one of them are relatively accurate or 'selected'. The only
word in Lucas's data which differs sharply from the target in session 1 is the
name Gisele /izl/, pronounced as [zizi]. This word occurs alongside similar
reduplicated word forms from the input (e.g., caca).
Table 10.3. Lucas's words in session 1 (1;9.21)
Orthography Gloss Adult form Child form
1. Cacá (name) kaka tata
2. esse 'this' es e
3. Gisele (name) zl zizi
4. mamãe 'mother' mmj mj, mmj
5. n no no nnw nnw
6. oi 'hi' oj oj
7. papai 'father' papaj papaj
8. Zizi (name) zizi ii
In session 2 (1;10.31) there are two radically changed words, both proper
nouns: Gabriel /gabiw/ [bebe] and Pedro /pedu/ [dudu]. At this point we
see the beginning of systematization, as a template related to the CVCV form has
emerged, in which a syllable is reduplicated (C1V1C1V1). Template formation is
evidenced not only by the two adapted words just mentioned but also by the
selected words bebê 'baby' [bebe], vovó 'grandma' [vɔvɔ], and vovô
'grandfather' [vovo], which are target-like and fit the reduplicated pattern. This
suggests that selected words have an important role in shaping a template,
while adapted words extend and reinforce it.
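The reduplicated shape C1V1C1V1 described above can be stated as a simple check on a transcribed form. The following is only a sketch under stated assumptions: forms are given in a broad ASCII transcription with one letter per segment, the vowel set is limited to a, e, i, o, u, and the function name is hypothetical. It is not the procedure used in the study, where template membership was judged against the full set of a child's forms.

```python
# Assumption: broad ASCII transcription, one letter per segment.
VOWELS = set("aeiou")


def fits_reduplicated_template(form: str) -> bool:
    """Return True if the form is C1V1C1V1, i.e. one CV syllable said twice."""
    if len(form) != 4:
        return False
    first, second = form[:2], form[2:]
    # The repeated unit must be a consonant followed by a vowel.
    is_cv = first[0] not in VOWELS and first[1] in VOWELS
    return is_cv and first == second


# Child forms cited in the text, plus one non-reduplicated shape for contrast.
for form in ["bebe", "dudu", "vovo", "pato"]:
    print(form, fits_reduplicated_template(form))
```

Forms with a final offglide, such as [papaj], fall outside this strict four-segment version and would need an extended pattern.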
In session 3 (2;0.2) the number of adapted words increases (see Appendix).
Lucas's adapted words mostly take the CVCV reduplicated shape, as seen in
Fernanda /fehnda/ [veve] and Izabel /izabw/ [pp] (again, both names).
One CV from the target word, mostly, but not always, from the stressed syllable,
is reduplicated to give the child form. From sessions 1 to 3 we see a rise in
the number of words that fit this pattern, with a corresponding decrease in accuracy,
reflecting an increase in systematicity (Vihman and Velleman 2000).
In the following sessions, in parallel with use of the reduplicated template, a
new template emerges which involves the production of a consonant in
word-final position. CVC syllables are an important characteristic of Lucas's
phonological inventory. Lucas produced codas from the very first session (e.g., tira
[dim] 'take', pode [p] 'you can'). Starting with session 4, heavy-syllable
production (i.e., production of syllables with a postvocalic segment) increases
and becomes part of a genuine template which applies to new words, both
selected and adapted.
Table 10.4 shows the frequency of productions with closed syllables in
Lucas's data over twelve months (in percentage of word tokens). We can
identify a template of the form (CV)CVC. The first syllable (CV) is optional
and the word-final C stands for a consonant of one of two kinds:
Table 10.4. Closed syllable production

Session   Age       N closed syllables   % closed syllables
1         1;9.21    2                    13
2         1;10.31   7                    16
3         2;0.2     9                    12
4         2;1.2     26                   23
5         2;1.27    33                   28
6         2;2.26    46                   36
7         2;3.29    56                   45
8         2;4.26    78                   53
9         2;5.27    77                   45
10        2;6.28    85                   64
11        2;7.26    71                   47
12        2;8.25    68                   44
(1) Sibilant. This occurs when there is either a sibilant or an affricate in the
target: cf. bruxa 'witch' /bɾuʃa/, pronounced as [bu], and pode 'is able,
can' /pɔdʒɪ/, pronounced as [p]. The use of a sibilant in word-final
position is in most cases closely based on the target.
(2) [m] or [w], as in the word sapo 'toad' /sap/, produced as [m] or [saw].
Coda [m] does not occur in Brazilian Portuguese.
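The percentages in Table 10.4 can be checked against the token counts in Table 10.2, since the table is stated to be in percentage of word tokens. The sketch below copies both sets of figures from those tables; that the reported values are truncated rather than rounded is an inference from the numbers, not something the chapter states.

```python
# Lucas, sessions 1-12.
# Word tokens per session (Table 10.2) and closed-syllable counts (Table 10.4).
tokens = [15, 42, 75, 109, 115, 127, 122, 147, 171, 131, 151, 153]
closed = [2, 7, 9, 26, 33, 46, 56, 78, 77, 85, 71, 68]

# Percentages as printed in Table 10.4.
reported = [13, 16, 12, 23, 28, 36, 45, 53, 45, 64, 47, 44]

# Assumption: percentages are truncated to whole numbers, not rounded.
computed = [100 * c // t for c, t in zip(closed, tokens)]

for session, (c, t, pct) in enumerate(zip(closed, tokens, computed), start=1):
    print(f"session {session:2d}: {c}/{t} = {pct}%")

print("matches Table 10.4:", computed == reported)
```

Under that truncation assumption the computed values agree with the table for all twelve sessions.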
In the following figures we present separately the percentage of each of these
coda types. Figure 10.1 presents the proportion of sibilants in coda compared to
general coda production in twelve sessions. It includes the palatal sibilant [ʃ]
and the alveolar sibilant [s], as Lucas varies production between a palatal and an
alveolar form.
In Figure 10.1 we see that the proportion of sibilants in coda position
generally decreases from the first session on. In the following sessions new
types of coda emerge, represented by [m] and [w]. Figure 10.2 shows the
emergence of a coda [m], which is not target-like.
In Lucas's early sessions, nasals occur in codas only in words that include any
kind of nasalization, such as the nasal onsets in mamãe 'mother' /mam/
[mamm] and não 'no' /n/ [nm]. In both cases the coda nasal can be seen
as a consequence of the spreading of the nasalization feature or gesture; in other
words, it is a phonetically motivated phenomenon. In later sessions, coda nasals
become highly productive. Table 10.5 presents some examples of nasal and
glide occurrences in word-final position.
[Figure: x-axis 'Session' (1-12); y-axis '% over all closed syllables' (0-100)]
Figure 10.1. Percentage of sibilants in coda position over all closed syllables
In some words the nasal coda competes with glide [w], e.g., bicicleta
'bicycle' (/adem/ session 7, /etw/ session 9), peteca 'shuttlecock' (/abebem/
session 6, /t/ session 8). Figure 10.3 illustrates the competition between
nasal and glide in word-final position, over twelve sessions. The figure shows
that, in most sessions, as one type of coda increases in frequency, the other
decreases.
Figure 10.3 displays the percentage of nasal and glide [w] production as a
proportion of all codas over time (twelve sessions). It shows that production of
glide [w] overtook nasal production from session 8 onward. This generally
meant an increase in accuracy, or similarity between child form and adult target.
However, it is important to note that although a glide [w] is part of the adult
phonotactics, it occurs in Lucas's data even in words where there is no [w], such
as bicicleta 'bicycle' /bisiklɛta/, produced as [etew].
Table 10.5. Examples of words with final [m] and [w]
Session Orthography Gloss Target form Child form
4 (2;1.2) tira 'take it' ia diw, dim
5 (2;1.27) cavalo 'horse' kaval avam
6 (2;2.26) a peteca 'shuttlecock' aptka abebem
7 (2;3.29) bicicleta 'bike' bisiklta adm
8 (2;4.26) peteca 'shuttlecock' ptka taw
9 (2;5.27) bicicleta 'bike' bisiklta etew
10 (2;6.28) coca-cola 'coca-cola' kkakla ttw
11 (2;7.26) sapo 'toad' sap m, saw
12 (2;8.25) bola 'ball' bla bw, bm
[Figure: x-axis 'Session' (1-12); y-axis '% over all words' (0-80); series 'Coda in general' and 'Coda m']
Figure 10.2. Emergence and evolution of coda [m] over twelve sessions
Lucas's data are in some ways reminiscent of those reported in Priestly (1977).
Priestly observed that from age 1;10 to 2;2 many of his son Christopher's forms
had a medial [j], although they corresponded to words in the adult language with
no medial glide (e.g., chocolate [kajak], panda [pajan]). According to the author,
his son's forms reflect early attempts to produce polysyllabic words that require
sequential production of two or more different consonants.
This reasoning may also be applied in Lucas's case: why did Lucas choose
the consonant [m], despite the fact that it is not present in coda position in adult
targets? Could this be related to ease of articulation? Although [m] is not
encountered as a target coda, it does occur in syllable onsets (24 percent of
the adult target forms of words that Lucas produced with coda [m] have some
kind of nasalization). Coda [m] could reflect an attempt to maintain nasality.
Examining the phonetic contexts in which Lucas uses word-final [m] or [w]
can also help us to understand whether his template use is a strategy for dealing
with difficult words (as Priestly interprets his son's pattern of use). Table 10.6
shows the main contexts in which a coda consonant occurs (with percent of
word tokens), based on segments present in the target form that might condition
the occurrence of the nasal or [w] in word-final position. Nasal (N), in the
second column, refers to any nasalization in the target word, including a nasal
consonant or a nasalized vowel (for example, Aninha [an]). The other
contexts refer to segments present in the word-final syllable. The third column,
U, refers to any target word that ends in the vowel [ʊ] (not a glide), such as sapo
'toad' /sapu/. Cluster (CCV), liquid (LV), fricative (FV), and velar (KV) refer to
words that have one of these in the final syllable, such as tigre 'tiger' /tʃigɾɪ/,
bola 'ball' /bɔla/, estava 'I was' /istava/, peteca 'shuttlecock' /pɛtɛka/.
[Figure: x-axis 'Session' (1-12); y-axis '% over all closed syllables' (0-70); series 'Pattern m' and 'Pattern u']
Figure 10.3. Production of [w] and [m] in word-final position: percentage
over all closed syllables
Table 10.6 provides an overview of coda production in all sessions. If
we analyze Lucas's data session by session we see that in sessions 2 to 4,
words with [m] in word-final position generally have some form of
nasalization in the adult target, e.g., vão 'let's go' /vãw/, produced as [mm]. Thus
the nasal coda is at first conditioned by any kind of nasalization in the
word. However, other conditioning factors for the nasal coda are: (1) the
vowel [ʊ/u] in the final syllable (note that [ʊ] marks masculine gender in
Portuguese and is thus of very high occurrence), or another labial somewhere
in the word (coca-cola and tira are the only two words in Table 10.5 that
have no labial); and (2) the presence of difficult consonants. In most
sessions Lucas uses a nasal to complete words with difficult consonants,
such as those with liquids (tira 'take it' /ir/) and words ending in CCV
(tigre 'tiger' /tʃigɾɪ/). Nasal [m] and vowel [u] occur at the same time,
operating during the same period.
The main conditioning factor for [w] occurrence in word-final position
seems to be [ʊ] in the final syllable (49 percent). In words like carro 'car'
/kahu/ [kaw] we cannot determine whether Lucas is applying his template or
simply omitting the consonant [h] and reducing the vowel [ʊ] to a glide.
However, in a word like tigre 'tiger' /tʃigɾɪ/ [iw] Lucas inserts a glide [w]
where there is no [u] in the target. In this case it appears, as with Priestly's
subject (1977), that Lucas is using a kind of strategy to deal with an
articulatory challenge.
In summary, Lucas first developed a simple reduplicative CVCV template.
Later, a template with coda [m] emerged, which was gradually supplemented
and finally supplanted by the coda with glide [w]. The template with coda [w]
persists through the last of the twelve recorded sessions.
Table 10.6. Conditioning factors for nasal [m] and glide [w] occurrence
in word-final position

        N          U          CCV        LV         FV         KV       Other      Total
[m]     23 (24%)   34 (35%)   13 (13%)   15 (15%)   3 (3%)     3 (3%)   6 (7%)     97
[w]     10 (9%)    50 (49%)   6 (5%)     6 (5%)     12 (11%)   5 (4%)   16 (15%)   105
Total   33 (17%)   84 (41%)   19 (9%)    21 (10%)   15 (8%)    8 (4%)   22 (10%)   202

Notes
The percentage is based on total words pronounced by Lucas with coda.
N = nasal anywhere in the word, e.g., Aninha (name) /n/
U = vowel /u/ at the end of the word, e.g., sapo 'toad' /sapu/
CCV = cluster + vowel sequence in word's final syllable, e.g., tigre 'tiger' /tʃigɾɪ/
LV = liquid + vowel sequence in word's final syllable, e.g., tesoura 'scissors' /tʃizoɾa/
FV = fricative + vowel sequence in word's final syllable, e.g., estava 'I was' /istava/
KV = velar + vowel sequence in word's final syllable, e.g., peteca 'shuttlecock' /pɛtɛka/.
Overview of Lucas's segmental inventory
To complete our analysis of Lucas's phonological development (and for the
purposes of comparison with Paulo), in Table 10.7 we present an overview of
his segmental inventory at three points: session 1 (1;9.21), session 6 (2;2.26),
and session 12 (2;8.25). In the inventory we include any segment that occurs at
least twice in each word position: onset, medial, and final for consonants, and
stressed and unstressed for vowels.
In session 1, Lucas's phonological inventory is small. Of the stop consonants
we find only the unvoiced [p] and [t]. Sibilants occur in word-initial and
word-medial positions. In word-final position the palatal sibilant occurs in place of the
alveolar sibilant (in the adult language, in the Belo Horizonte dialect, only the
alveolar sibilant occurs in coda position). In session 6, voiced stop consonants
[b] and [d] are present. The velar consonants [k] and [g], the tap [ɾ], and the
fricatives [h, ɦ] are not produced in any position in the twelve sessions analyzed.
In session 12 lateral [l] occurs in medial position. In onset position we also find
affricates that match the target. In stressed position the vowel inventory is
complete from session 6, but in unstressed position open medial vowels are
yet to be consolidated even in session 12.
Table 10.7. Lucas's segmental inventory at three points

                        Session 1 (1;9.21)         Session 6 (2;2.26)           Session 12 (2;8.25)
Consonant inventory
  Word onset            (p) (t); [z] (); (m) (n)   p b t d; (f) v []; m n; ()   p b t d; f v s []; m n; ()
  Word medial           (p) (t); [z]; (m) (n)      p b t d; (f) (v) s []; m n   p b t d; f v [s] [] (); m n; l
  Word final            [ʃ] w j                    [ʃ] [m] w j                  [ʃ] [m] w j
Vowel inventory
  Stressed              (i); (e) (o); (); a        i u; e o; ɛ ɔ; a             u; e o; ɛ ɔ; a
  Unstressed syllable   (i) ( u); ( a)             i u; e (o); a                i u; e o; [] (); a

[ ] = segments produced only as substitutions for adult segment, never as match-to-target
( ) = phones produced in only one word
Note that Lucas's first words mainly have a weak-strong stress pattern,
despite the fact that the adult targets tend to have penultimate stress (see
Table 10.8, which quantifies word length and stress in the first three sessions).
Case study 2: Paulo's phonological development
Table 10.9 provides an illustration of the word templates found in Paulo's
sessions 1 (1;11.13) and 2. Note that although some words are clearly selected
and others adapted, others may be close to the model and thus accurate or
'selected' in some respects but modified to fit the child's template, or 'adapted', in
other respects.
The strongest template in these two sessions involves full reduplication, with
an optional offglide that occurs when stress falls in final position. Like Lucas,
Paulo usually changes the stress from the penultimate to the final syllable
(Table 10.10, based on the first session). There is also a CV(V) template.
The reduplication template occurs in the first two sessions, in both selected
and adapted words. The CV template occurs in the first session in selected
words only and in the second session in adapted as well as selected words.
Table 10.9. Examples of Paulo's word templates

Template                 Session    Word                   Adult form   Child form
C1V1C1V1(V) (Selected)   1 (1;11)   vovó 'grandmother'     vɔvɔ         [vɔvɔ]
                         1 (1;11)   papai 'father'         papaj        [papaj]
                         2 (2;00)   mamãe 'mother'         mamj         [mmj]
                         2 (2;00)   vovô 'grandfather'     vovo         [vovo]
CVCV(V) (Adapted)        1 (1;11)   Letícia (name)         leisa        [tata]
                         1 (1;11)   Luciana (name)         lusina       [uu]
                         2 (2;00)   tartaruga 'turtle'     tartag       [tata]
                         2 (2;00)   Roseli (name)          hozeli       [ii]
CV(V) (Selected)         1 (1;11)   pão 'bread'            pw           [p]
                         1 (1;11)   banho 'bath'           bj           [bj]
                         2 (2;00)   pé 'foot'              p            [p]
                         2 (2;00)   por 'to put'           por          [po]
CV(V) (Adapted)          2 (2;00)   Silene (name)          silen        [e]
                         2 (2;00)   Edmar (name)           emar         [ma]
Table 10.8. Word length and stress for Lucas

                Adult target   Child production
Monosyllables   18/72 (25%)    37/83 (45%)
Iambic          30/72 (42%)    42/83 (51%)
Trochaic        24/72 (33%)    4/83 (5%)
Table 10.11 gives an overview of the first two sessions, those in which these
templates are strongest.
In both sessions the majority of words are selected. Harmony (a phonological
process that provides consonant assimilation in place or manner) and
reduplication are the main processes that give rise to the C1V1C1V1(V) template in
Paulo's data. Here reduplication (or harmony) is generally, but not always, a
consequence of regressive assimilation, in which the child 'copies' or
'anticipates' the following segment. Examples, from sessions 1 and 2, are given in (1):
(1) Orthography   Target form   Child form
    sapo          sap           papu
    copo          kp            pp
Another notable characteristic of Paulo's first words is the very extensive use of
labial consonants. Table 10.12 and Figure 10.4 show the proportion of use of
labials and dentals over five sessions, over all words produced. We count as
labials words that have labial consonants exclusively, including [m], [b], [p],
[v], [f], and as dentals words that have alveolar/dental consonants exclusively,
including [d], [t], [s], and [z]. 'Labials + dentals' refers to words which include
both, while 'other' refers to all words which include a labial or dental with
another consonant type.
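The four-way count just described can be sketched as a small classifier. This is an illustration under stated assumptions rather than the author's coding procedure: each word is represented only by its consonants in a one-letter ASCII transcription, the labial and dental sets contain just the segments listed in the text, and words with no labial or dental at all are also placed in 'other', a case the text does not address. The function name is hypothetical.

```python
LABIALS = set("mbpvf")  # labial consonants listed in the text
DENTALS = set("dtsz")   # alveolar/dental consonants listed in the text


def classify_by_place(consonants: str) -> str:
    """Classify a word from its consonants (one ASCII letter per consonant)."""
    found = set(consonants)
    has_labial = bool(found & LABIALS)
    has_dental = bool(found & DENTALS)
    has_other = bool(found - LABIALS - DENTALS)
    # Assumption: words with neither labials nor dentals also count as 'other'.
    if has_other or not (has_labial or has_dental):
        return "other"
    if has_labial and has_dental:
        return "labials + dentals"
    return "labials only" if has_labial else "dentals only"


# Consonant skeletons of a few hypothetical forms.
for skeleton in ["pp", "tt", "sp", "kp"]:
    print(skeleton, "->", classify_by_place(skeleton))
```

Counting the labels over all words in a session and dividing by the session total would give proportions of the kind reported in Table 10.12.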
As we can see, Paulo's first words consist mainly of labials. The proportion of
labials gradually decreases while alveolar production increases. In fact, in
sessions 4 (2;2.20) and 5 (2;3.22) we can see a kind of 'alveolarization' of
Paulo's consonants, as both alveopalatal fricatives and velar stops tend to be
substituted by alveolar stops (e.g., girafa [diafa], aqui [ati]). This is related to
the move from templates to segments, as the alveolarization seems to be related
to liberation from consonant harmony.
Table 10.11. Overview of Paulo's sessions 1 (1;11) and 2 (2;00)

Session   Total number   Selected   Adapted    Reduplicated   CV         Others
          of words       items      items      CVCV
1         24             15 (62%)   9 (38%)    13 (54%)       7 (30%)    4 (16%)
2         31             20 (64%)   11 (31%)   9 (29%)        10 (32%)   12 (39%)
Table 10.10. Word length and stress for Paulo
Adult target Child production
Monosyllables 7/64 (11%) 20/86 (23%)
Iambic 27/64 (42%) 42/86 (49%)
Trochaic 30/64 (47%) 24/86 (28%)
Paulos reduplicated and CVpatterns are mainly seen in the rst two sessions.
In the following sessions we see some word forms adapted to t the CVCV
template generally fossilized forms (i.e., inaccurate child forms that remain as
such for some time, even when new words with the same sounds are produced
accurately), such as proper nouns. Table 10.13 shows all of the adapted forms
found in session 3 (2;1.28).
Of the 69 word types that occur in session 3, 17 child forms (29 percent)
exhibit some kind of harmony or reduplication or, in other words, some degree
of adaptation. The other words exhibit no harmony and most are produced with
variegated consonants. In session 4 (2;2.20) some adapted reduplicated forms
can still be found. These forms are illustrated in Table 10.14.
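As a rough illustration of what 'harmony or reduplication' means operationally, the following Python sketch checks a form for full reduplication and for full consonant harmony. It is a simplification under stated assumptions: forms are given as simplified ASCII strings like those in Table 10.13, the vowel set is reduced to five letters, and the helper names (`is_reduplicated`, `has_consonant_harmony`) are hypothetical, not part of the analysis reported here.

```python
# Assumption: simplified ASCII transcriptions with only these vowels.
VOWELS = set("aeiou")


def consonants(form):
    """Return the consonants of a form in order of occurrence."""
    return [ch for ch in form if ch not in VOWELS]


def is_reduplicated(form):
    """True if the form is one string repeated twice, e.g. 'tata'."""
    half, remainder = divmod(len(form), 2)
    return remainder == 0 and half > 0 and form[:half] == form[half:]


def has_consonant_harmony(form):
    """True if the form has at least two consonants and all are identical."""
    cons = consonants(form)
    return len(cons) >= 2 and len(set(cons)) == 1


if __name__ == "__main__":
    for form in ["tata", "mimi", "pepi", "papapu"]:
        print(form, is_reduplicated(form), has_consonant_harmony(form))
```

A form such as 'pepi' is harmonized without being reduplicated, which is why the two checks are kept separate; partial harmony (agreement in place but not voicing) would need a feature-based comparison that this sketch does not attempt.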
There are fewer adapted words in session 4 than in session 3 and new words
adapted to a reduplicated template are rare in the following sessions. Some
Table 10.12. Paulo's labial and dental production over five sessions (see note 1)
                    Session 1   Session 2   Session 3   Session 4   Session 5
                    N (%)       N (%)       N (%)       N (%)       N (%)
Labials only        17 (58)     26 (64)     40 (48)     27 (34)     16 (11)
Dentals only        4 (14)      5 (12)      25 (30)     29 (37)     52 (37)
Labials + dentals   4 (14)      5 (12)      1 (1)       7 (9)       25 (18)
Other               4 (14)      5 (12)      18 (21)     15 (20)     47 (33)
Figure 10.4 Labials and dentals over five sessions (x-axis: session, 1–5; y-axis: % over all words, 0–70; series: labials, dentals)
words are crystallized, frozen or entrenched forms, which retain the template
shape for a long time. These forms may have been adopted and reinforced
by the adults, in their child-directed speech. In the last sessions templates apply
only to specific words that appear to have become entrenched, such as proper
nouns. For example, at 1;11 tartaruga 'turtle' /tahtauga/ is produced as [tata]
and Roseli (name) /hozeli/ as [ii]; at 2;10 tartaruga has become [tahtaga]
but Roseli remains unchanged. We can also see the child using adapted
CVCV forms in some phrases, such as, for example, comeu bolo 'eat cake'
Table 10.13. Paulo's template words produced in session 3 (2;1.28)
Orthography Gloss Adult form Child's form
1 chapu hat apw ppw
2 chapu it is the hat apw ppw
3 chapu pega chape hat take the hat apwp apw pwpppw
4 Dani tartaruga Danis turtle danitartauga danitata
5 dormi na gua sleep in water duminaaga mimiaaka
6 dormindo sleeping dumin mimiunu
7 dormir sleep dumi amimi, mimi
8 Duda (name) duda dutu
9 Felipe (name) lip pipi
10 o Felipe it is Felipe ulipi upipi
11 hipoptamo hippopotamus hipoptm ppp, papap
12 Letcia (name) leisa tata
13 peixe fish pe pepi
14 prncipe prince psip ppi
15 sapato shoe sapat papapu
16 tartaruga turtle tartauga tata
17 umbigo navel bigu bibi
Table 10.14. Paulo's template words produced in session 4 (2;2.20)
Orthography Gloss Adult form Child's form
1 acabou finish akabo bobo, abo
2 atender answer atde dede
3 dormir sleep dumi mimi
4 dormir na gua sleep in the water duminaaga mimiaga
5 duro hard duu dutu
6 entendeu understood tde dedew
7 peixe fish pe pepe
8 Roseli deu Roseli gave hozelidew iideu
9 sapato shoe sapatu papapu
10 tartaruga turtle tartauga tata
11 por tartaruga put turtle portartauga potata
/kumewbolu/ [memewbolu] (session 7, 2;5.20). In this case only the word
comeu is adapted to a reduplicated form as part of a combination. This provides
some evidence for the template being deployed as a way of dealing with
combinations, but the current data are insufficient for testing the extent of this
possible template function.
As the use of templates decreases in the later sessions new forms are emerging.
These new forms do not fit into a simple CVCV pattern or any other set
schema or template. At this point we can more usefully analyze the relation
between child forms and adult targets by making reference to phonological
processes or rules, which apply segment by segment, with a straightforward
alignment between target word and child form. Thus we see the template
gradually fading out and the segment emerging as an important unit of
phonological organization. To illustrate, we show the evolution of the phonetic forms
of the word tartaruga 'turtle' in Table 10.15.
The word tartaruga is first adapted to a reduplicated template as [tata]
(sessions 1 to 4). In session 4 the adapted form is fitted into a combination:
por tartaruga /pohtahtauga/ [potata] 'put turtle'. In session 5 we see the
emergence of a new representation. In this session tartaruga takes three different
forms, which suggests an unstable representation, characteristic of a period
of transition, in this case from template to segment-based phonology. From
session 5 on, the relationship between Paulo's productions of tartaruga and the
adult target can be analyzed in terms of substitution and deletion processes;
there is no longer any reason to refer to the application of a holistic template. We
can see these changes in other words, such as Luciana (name) /lusina/, which
is first pronounced as [uu], fitted into a reduplicated template, but which in
session 7 is pronounced variably as [sina] and [na]. Here again we can see
the emergence of a form that is closer to the adult target.
Table 10.15. Changes in the production of one word over twelve sessions, both in isolation and as part of combinations
Session       tartaruga 'turtle' (adult form: tartauga)   Adult form                                      Gloss (if not simply 'turtle')
1, 2, and 3   tata
4             potata                                      por tartaruga [pohtahtauga]                     'put turtle'
              tata
5             tataluga, tatautugu, tatau
6             tauga
7             tatauga
              tatauganadanaagʷa                           a tartaruga nada na água [tahtauganadanagwa]    'turtle sleeps in the water'
8 and 9       tatauga
11            tatauga
              tataugapkena                                tartaruga pequena                               'small turtle'
Another important piece of evidence for the emergence of segmental
phonology is variation in the production of segments in different tokens of
the same word. From session 3 (2;1.28) onward variation in word forms
becomes more frequent. The same target segment is pronounced differently
in different words and even in different tokens of the same word. For
example, the word bruxa 'witch' is pronounced as [bua], [buta], and
[bua] in the same session. It is as if Paulo were playing with sounds, trying
out different forms. We can see many examples of variation in the production
of specific segments, as in the case of the word bruxa. In Table 10.16 we
provide one example from each of the later sessions to show this variation,
which we take to reflect the process of reorganization and the gradual
emergence of the segment as a unit in the child's phonological grammar.
Note that in those words which contain more than one supraglottal consonant,
the variability involves one of those consonants only; in almost no case does
it reflect the influence of the place of articulation of the other consonants in
the word, in contrast to the effect of harmony and reduplication seen in the
earlier sessions. Thus, this variability shows that at this point the child is
dealing with segments, not with holistic templates.
Table 10.16. Word form variability in nine sessions
   Orthography   Gloss              Session   Adult target (a)   Child forms            Target segment
1. tirar         'take it'          4         ia                 ia, ia, dia            affricate
2. foi           'he/she went'      5         foj                toj, oj                labiodental fricative
3. cadê          'Where is it?'     6         kade               kade, ade, de          velar stop
4. Letícia       (name)             7         leisa              isja, sa               affricate
5. aqui          'here'             8         aki                akia, ati              velar stop
6. número        'number'           9         nume               numi, numelu, nume     flap
7. desse aqui    'of this here'     10        desiaki            desiki, deiki          alveolar fricative
8. palhaço       'clown'            11        palasu             pajasu, palasu         lateral
9. trilho        'rail'             12        til                ilu, tilu              cluster
(a) Target segment is underlined.
Overview of Paulo's segmental inventory
To analyze Paulo's move to segmental representation it is important to have an
overview of his phonological inventory. Table 10.17 shows Paulo's segmental
inventory at three developmental points: session 1 (1;11.13), session 6 (2;4.21),
and session 12 (2;10.20). This table shows the small inventory of consonants in
the beginning, when most of his words occurred in a templatic form, and the
increase in his inventory around the time when the segment became a functional
unit of representation.
In session 1 only labials occur in word onset and medial positions, except for
[t] and [h] (which occur once each). The only fricative is the labiodental [v],
which occurs in both positions. In session 6, we see other fricatives emerge,
especially in word-medial position. Palatal sibilants occur only in substitution
for other consonants. The voiced velar [g] does not occur. We find target-like
affricates in initial position and lateral liquids in initial and medial position. In
stressed position the vowel system is complete in session 6. In unstressed
syllables the open medial vowels [ɛ] and [ɔ] occur only once each. By session
12 the consonant inventory has become quite large; only the tap [ɾ] and [h]
(corresponding to orthographic and historical r) are missing. The [h] found in
Table 10.17. Paulo's segmental inventory at three points
                         Session 1 (1;11.13)   Session 6 (2;4.21)        Session 12 (2;10.20)
Consonant inventory
  Word onset             p b (t)               b p t d k                 p b t d k g
                         v                     (s) [] v                  f v s z
                         m                     m n                       m n
                                               (l)                       (l)
  Word medial            p b                   p b t d k g               p b t d k g
                         v                     (f)(v)( s) (r) [] []      f v s z
                         m                     m n                       m n
                         (h)                   l l                       l l
  Word final             j w s ()              j w s                     j w
Vowel inventory
  Stressed               (u)                   i u                       i u
                         (e) o                 e o                       e o
                         a                     a                         a
  Unstressed syllable    i u                   i u                       i u
                         (e) o                 e o                       e o
                         ()                    () ()                     ()
                         a                     a                         a
[ ] segments produced only as substitutions for adult segment, never as match-to-target
( ) segments produced in only one word
the first session does not occur in sessions 6 or 12. In session 12 there are again
target-like affricates in initial and medial position. The bilabial, alveolar, and
velar stops occur accurately in onset position in the last session analyzed, as do
the labiodental, alveolar, and palatal fricatives.
A summary of template evolution
This study has explored the use of word templates in two children acquiring
Brazilian Portuguese. We have described the emergence and evolution of
templates over one year.
Both children started out using a reduplicative CVCV template. This pattern
is quite common in babbling and in first words in most languages (MacNeilage,
Davis, Kinney, and Matyear 2000). CV is generally taken to be the least marked
syllable, occurring in all languages and emerging in children as the first
adult-like or 'canonical' syllable (Oller 2000). The use of a form which reduplicates a
CV syllable can be explained as being especially easy, both because CV is an
early-learned syllable, motorically accessible to the child, and because CV is the
most frequent syllable type in Brazilian Portuguese (Almeida 2005). It may be
assumed that repeating such a syllable twice is also relatively simple in terms of
speech planning, easier than coordinating two different syllables of any kind.
Stress may also play a role in characterizing a child's templates: Both Lucas
and Paulo tend to move the stress from the penultimate to the final syllable in
their word templates. However, although final syllable stress is not the most
common pattern in Brazilian Portuguese, it is very frequent in child-directed
speech. Input frequency may therefore explain this preference in both cases.
We have seen that both of the children followed in this study converged on
reduplication as their solution for meeting the challenge of producing adult
words. However, following the period in which the reduplicated template is
most active, the developmental paths taken by the two children diverged.
For Lucas, the first child discussed in this chapter, a new template emerges in
session 3, represented by a bilabial nasal /m/ in word-final position. Later on,
Lucas's phonological system undergoes reorganization, with the emergence of a
new template with a coda glide [w], which competes with the template form
with coda [m] until the form with [w] gradually comes to dominate. This
competition can be explained in part by the articulatory similarity between the
two bilabial phones, [m] and [w]. In the transition from [m] to [w] use,
competition between [m] and [w] was observed even in the same word and in
the same session. Some words changed pattern over time as the new word
template was adopted. Thus in Lucas's case progress is expressed as a move
from one template to another.
In the case of the second child, Paulo, the decline of the reduplicated template
can be related to the emergence of the segment as a unit of phonological
representation. During this shift to reliance on segments we see unstable
behavior, involving play with words and sounds, which resulted in a lot of
variation in the production of some words (see Table 10.16). Paulo's
reduplication template gradually fades in the last sessions, remaining active only as
regards specific entrenched words, such as proper nouns. This is similar to the
developmental pattern in the case study reported by Macken (1979), in which a
child's template gradually faded as the child learned the contrast between
individual sounds.
General discussion
The two case studies presented here raise some issues regarding the
development of phonological knowledge which we would like to briefly outline here:
the significance of the occurrence of non-adultlike structures, variability in
word or segment production, the role of frequency in template formation, and
the relative role of the word vs. the segment in representations.
Lucas's nasal-coda template is noteworthy because nasal consonants do not
occur in coda position in adult Brazilian Portuguese. This finding thus presents a
challenge for acquisition theories because the template, which Lucas uses
consistently over a period of several months, cannot derive directly from his
ambient adult language, nor are codas considered to be unmarked and thus
expected to occur in early words on the basis of such principles as the
emergence of the unmarked (McCarthy and Prince 1994; Gnanadesikan 2004). We
are thus faced with a case in which neither universal properties nor input
frequency can explain why a child produces a nontarget form. Thus a template
analysis can be a useful tool for discovering what sounds or structures represent
an articulatory challenge for the child, or conversely, which sounds or structures
a given child may find easy.
There is some variation in both Lucas's and Paulo's data, but the variation
seems to apply to different-sized units. In Paulo's case the variation may well
reflect segmental learning (Ferguson and Farwell 1975). Such segmental
learning means that a new, more detailed level of analysis has begun to develop, such
that the segment emerges as a functional unit alongside the word. For Paulo a
focus on mastering specific segments means that a single segment in the word,
the one being targeted perhaps, is produced with variability. This variability is
not affected by the other consonants in the word, and the variably produced
segment occupies a different place in different words, which shows that the
problem is not with any one word structure but rather with a particular segment.
Lucas's productions, on the other hand, are relatively more stable throughout
the period of the study, perhaps because he has not yet moved to segmental
representations over this period. In Lucas's case there is also competition
between two segments, [m] and [w], in the monosyllabic templates, which
could be taken as a case of variability in segment production. However, the
competition occurs between the holistic CV[m] or CV[w] forms, and not
between [m]s and [w]s in different positions, or even in coda position in
different word structures. The competition, in Lucas's case, is not between
different renditions of a segment that appears in the target form, but rather
between two child forms, neither of which may be an accurate rendition of the
adult form. We see, then, that different types of variability or competition
between forms can be informative as to the size or identity of the unit that a
child seems to be operating with.
We have seen in Paulo's data that the reduplicative template, once it has
mostly faded out of use, continues to apply only to specific words entrenched
as proper nouns. That raises the issue of the influence of frequency on the
consolidation of a template. The frequent use of a word in a templatic form can
strengthen that form of the word, especially if it begins to be used by adults as
well as by the child. Reuse of this particular form, and even more so reuse by
more than just the child speaker, leads to this form being more readily
accessible, perhaps more strongly activated, more distinctly represented in memory,
than competing forms. It is likely that having the same form produced by other
speakers not only leads to this form being judged as accurate, due to its being
very close to the adult forms, but also to its creating a better defined and richer
representation, one which contains exemplars originating in more than one
voice. As long as this strengthening affects only a single word it will not
necessarily lead to the strengthening of the template, but instead will create a
'phonological idiom'. If more such words are reused and their representations
are strengthened in this way, such reuse could lead to the strengthening of the
template as a pattern that affects other words as well. The effect of frequency
here is parallel to that seen in language change, such that token frequency affects
changes that pertain to individual words (or entrenchment), but type frequency
affects generalization (Bybee 2001).
And finally, what status does the word have in a child's phonological system
once segments have begun to play a role as units of representation? As
suggested by Menn (1983), evidence that the word continues to serve as an
important unit of organization is the fact that the phonetic form mastered in
one word fails to occur in other words with similar targets. Phonological idioms,
or idiosyncratic forms produced by the child (Ferguson and Farwell 1975;
Menn 1983), would be a case in point. Paulo provides one such example. He
produced the word esse 'these' /es/ accurately from the third session (2;1.28).
However, at the same time other words with [s] were not pronounced accurately.
For example, the same target /s/ is pronounced as /t/ in Sirlei (name) [tilei], and
as /d/ in senta 'sit down' [dita]. There are also other cases in Paulo's corpus of
variability in production of the same sound in different words, such as the
alveopalatal sibilant /ʒ/, which Paulo pronounces as an affricate in feijão 'bean'
[teu] and as an alveolar sibilant in laranja 'orange' [neza] (session 6). This
phenomenon supports the claim that phonological acquisition is lexically
gradual. Although our examples largely concern early stages of phonological
development, it should be noted that studies of lexical diffusion provide
evidence that the word also has status in adult phonological representation (Wang
1969). Thus a phonological model is needed which recognizes segments as
functional units for both adults and children but that allows for the word in
lexical representation as well.
Both of the case studies reported here give support to the notion of the word
as an important unit of representation in child language acquisition. The
templates changed over the course of the year. For Paulo, with his more rapid
lexical advance, we see segments gradually becoming functional units of
representation and organization alongside the word. In contrast, Lucas persists
in relying on a template representation for all twelve sessions. However, we saw
that Lucas changed his templatic consonant [m] to [w], showing an adjustment
of his production toward the structure of the ambient language.
Note
1. We include only the first five sessions because thereafter Paulo began to produce
predominantly long phrases and words with only a single place of articulation became
increasingly rare.
References
Almeida, L. S. (2005). Um estudo sobre síntese de fala para o português brasileiro [A
speech synthesis study of Brazilian Portuguese]. MA thesis.
Bisol, L. (2005). Introdução a estudos de fonologia do português brasileiro [Introduction
to Brazilian Portuguese phonology studies]. Porto Alegre: EDIPUCRS.
Bybee, J. (2001). Phonology and language use. Cambridge University Press.
Medeiros, B. R. and Demolin, D. (2006). Vogais nasais do português brasileiro: um
estudo de IRM [Nasal vowels in Brazilian Portuguese]. Revista da ABRALIN, 5,
131–42.
Cristófaro-Silva, T. (2001). Fonética e fonologia do português: Roteiro de estudos e
guia de exercícios [Portuguese phonetics and phonology], 4th edn. São Paulo:
Contexto.
Ferguson, C. A., Peizer, D. B., and Weeks, T. (1973). Model-and-replica phonological
grammar of a child's first words. Lingua, 3, 35–65.
Gnanadesikan, A. E. (2004). Markedness and faithfulness constraints in child
phonology. In R. Kager, J. Pater, and W. Zonneveld (eds.), Constraints in phonological
acquisition, pp. 73–108. Cambridge University Press.
Keren-Portnoy, T., Majorano, M., and Vihman, M. M. (2008). From phonetics to
phonology: the emergence of first words in Italian. Journal of Child Language,
36, 235–67.
Lamprecht, R. R., Bonilha, G. F. G., Freitas, G. C. M., Matzenauer, C. L. B., Mezzomo, C. L.,
Oliveira, C. C., and Ribas, L. P. (eds.). (2004). Aquisição fonológica do Português:
perfil de desenvolvimento e subsídios para terapia [Phonological acquisition of
Portuguese]. São Paulo: Artmed Editora.
Lleó, C. (1990). Homonymy and reduplication: on the extended availability of two
strategies in phonological acquisition. Journal of Child Language, 17, 267–78.
MacNeilage, P. F., Davis, B. L., Kinney, A., and Matyear, C. L. (2000). The motor core of
speech: a comparison of serial organization patterns in infants and languages. Child
Development, 71, 153–63.
Mattoso-Camara, J. (1970). Estrutura da língua portuguesa [Structure of the Portuguese
language]. Petrópolis: Editora Vozes.
McCarthy, J. and Prince, A. (1994). The emergence of the unmarked: optimality in
prosodic morphology. In M. Gonzalez (ed.), Proceedings of the North East
Linguistic Society 24, pp. 333–79. Amherst, MA: Graduate Linguistic Student
Association, University of Massachusetts.
McCune, L. and Vihman, M. M. (2001). Early phonetic and lexical development: a
productivity approach. Journal of Speech, Language and Hearing Research, 44,
670–84.
Menn, L. (1983). Development of articulatory, phonetic and phonological capabilities.
In B. Butterworth (ed.), Language production, vol. 2, pp. 3–50. London: Academic
Press.
Miranda, I. C. (2007). Aquisição e variação estrutura de encontros consonantais
tautossilábicos [Acquisition and structured variation in tautosyllabic clusters].
Unpublished PhD dissertation, Federal University of Minas Gerais.
Oliveira, M. A. (1995). O léxico como controlador de mudanças sonoras [The lexicon as
a controller of phonological change]. Revista de estudos da linguagem, 4, 75–91.
Oliveira-Guimarães, D. M. (2008). Percurso de construção da fonologia pela criança:
uma abordagem dinâmica [Children's construction of phonology: a dynamic
approach]. Unpublished PhD dissertation, Federal University of Minas Gerais.
Oller, D. K. (2000). The emergence of the speech capacity. Mahwah, NJ: Lawrence Erlbaum.
Rangel, G. A. (1998). Uma análise de auto-segmental da fonologia normal: estudo
longitudinal de 3 crianças de 1:6 a 3:0 [An auto-segmental analysis of normal
phonology: A longitudinal study of 3 children from 1:6 to 3:0]. MA thesis.
(2002). Aquisição do sistema vocálico do Português Brasileiro [Vocalic system
acquisition of Brazilian Portuguese]. Unpublished PhD dissertation, Pontifícia
Universidade Católica do Rio Grande do Sul, Porto Alegre.
Teixeira, E. R. and Davis, B. L. (2002). Early sound patterns in the speech of two
Brazilian Portuguese Speakers. Language and Speech, 45, 179–204.
Thelen, E. and Smith, L. B. (1994). A dynamic systems approach to the development of
cognition and action. Cambridge, MA: MIT Press.
Viegas, M. C. (2001). O alçamento de vogais e itens lexicais [Pretonic vowel raising and
lexical items]. Unpublished PhD dissertation, Federal University of Minas Gerais.
Vihman, M. M. (1993). Variable paths to early word production. Journal of Phonetics,
21, 61–82.
Blackwell.
(2009). Word learning and the origins of phonological system. In S. Foster-Cohen
(ed.), Language acquisition, pp. 15–39. Basingstoke: Palgrave Macmillan.
Chapter 2.
Vihman, M. M. and Kunnari, S. (2006). The sources of phonological knowledge: a
cross-linguistic perspective. Recherches linguistiques de Vincennes, 35, 133–64.
Vihman, M. M. and Velleman, S. L. (1989). Phonological reorganization: A case study.
ogy? Towards an integration of linguistic and psychological approaches. In
M. Yavas (ed.), First and second language phonology, pp. 9–44. San Diego:
Singular Publishing. Reprinted in this volume as Chapter 9.
Vihman, M. M. and Vihman, V-A. (2011). From first word to segments: a case study in
phonological development. In E. V. Clark and I. Arnon (eds.), How children make
linguistic generalizations: experience and variation in learning a first language,
pp. 109–33. Amsterdam: Benjamins.
Wang, W. S-Y. (1969). Competing changes as a cause of residue. Language, 45, 9–25.
Appendix: Lucas's session 3
Orthography Gloss Target form Child form
1. Abre open it ab(r)(i) ab
2. gua water agʷa a
3. boi bull bo bo
4. Cac (name) kaka tata
5. chapu hat ap pep
6. desce go down ds d
7. dois two do s dos
8. no it isn't n n
9. embora away bra b
10. Fernanda (name) fernd vv
11. Izabel (name) izabw pp
12. Lucas (name) lukas us
13. Mame mom mm mm
14. no no n nm
15. n confirmation n n
16. nenm baby nen nen
17. po bread p p
18. ovo egg ov of
19. papai dad papa papa, papa
20. papel paper papw pp
21. parabns congratulations paab s pala
22. pato duck pat pap, pa
23. p foot p upa, pa
24. Pedro (name) pedr dudu
25. peixe fish pe pes
26. praia beach praa pa
27. t he/she/it is ta ta
28. tartaruga turtle tahtaruga ta
29. tchau bye a ta
30. tira take off ra dili
31. uva grape uva uf
32. Viviane (name) vivin vivi
33. vov grandmother vv ff
34. vov grandfather vovo vovo
35. xixi pee i zizi
The word total here does not correspond to Table 10.2, because imitated forms are not
included here. Forms in bold reflect the reduplication template.
11 Templates in French
Sophie Wauquier and Naomi Yamaguchi
1. Introduction
As must be clear from the variety of analyses and approaches proposed in the
literature on phonological acquisition, there is no straightforward way to establish
the format of children's first phonological units or the conditions that shape them,
independent of target language. This chapter presents a proposal to account for
the acquisition of French within a template model. At the outset (Section 1), three
issues must be considered, to clarify the basis for the proposed template and the
analyses to be provided here: the lexical status of the template (1.1), phonetic
continuity vs. typological constraints on the template (1.2), and the function of the
template (1.3). In Section 2 we address what the template should be, considering
the typological characteristics of French, and in Section 3 we present three
longitudinal data sets that illustrate what the early template might be in French
and how it evolves and changes with lexical growth.
1.1 Phonological or lexical starting point: why a lexical template?
The first problem that arises in attempting to determine the format of the first
phonological units is whether to analyze them as essentially phonological or
essentially lexical. Can children categorize the phonological sequences that
they hear directly from the input, to construct a representation that will enable
them to recover such elements as the syllables or the phonemes that make them
up? Or must they necessarily first resort to the lexicon (and thus acquire their
phonological knowledge through semantic bootstrapping based on the
referential dimensions of the target language)?
The idea that phonological acquisition is established through a lexical
template was originally proposed by Menn (1978), taken up by Macken (1992,
1995), and later further developed by Vihman and her collaborators (Vihman
and Velleman 1989, 2000; Vihman, Velleman, and McCune 1994; Vihman and
Croft 2007), within the older framework of the whole-word hypothesis
(Ferguson and Farwell 1975; Macken 1979; Menn 1971, 1983; Waterson
1971, 1987). The underlying assumption of this approach, as formulated by
Francescato (1968), is that 'children never learn sounds, they only learn words
and the sounds are learnt through words' (p. 148).
Previous analyses of French lead to the conclusion that children construct
their phonology on the basis of a small number of templates, shaping a
mini-lexicon that allows them to progressively develop the relevant phonological
generalizations (Wauquier-Gravelines 2005). Templates can be taken to reflect
the formal side of early words.
1.2 Articulatory continuity or typological constraints?
The adoption of the whole-word hypothesis and of a lexical template for the
acquisition of French raises the problem of how to model this template for
French and how to determine the constraints that apply to the production of the
first observable word forms in French data.
Despite being one of the pioneers in the collection and analysis of early word
production data in various languages (Vihman and Velleman 1989; Vihman
et al. 1994 for English; Vihman 1976, Vihman and Vihman 2011 for Estonian;
Vihman 1993 for French; Vihman and Velleman 2000 for Finnish;
Keren-Portnoy, Majorano, and Vihman 2009 for Italian), Vihman has not emphasized
typological constraints as a determining factor for the templates she describes.
Her initial focus was on establishing articulatory continuity from babbling to the
first words, which suggests for every child an individual developmental
scenario that is less likely to be influenced by the target language (Vihman,
Macken, Miller, Simmons, and Miller 1985).
More recently, Vihman has undertaken more systematic cross-linguistic
comparisons of her data (Vihman and Kunnari 2006) and opened a typological line of
inquiry by showing that the templates are at least partly constrained by the
regularities of the target language (Vihman 2010). After examining some ten or
twelve languages, she identifies the following major tendencies (Vihman 2010).
• The templates reflect a limited number of syllabic structures that never
  exceed two vocalic nuclei: CV, VC, CVC, CVCV, CVCVC.
• Consonant clusters and structures are generally absent.
• The templates are built on the basis of a limited segmental inventory,
  generally a subset of the inventory of the target language. This limited inventory
  seems to vary from child to child and relies, in part, on articulatory continuity
  from babbling to the first words.
• Consonantal variation across the lexical unit is restricted to manner or place
  only, not both, with full harmony the most common outcome.
• Melodic patterning (or a fixed segmental sequence) is also found within
  templates, though more rarely: in this case the consonantal sequences may
  be specified for place but not for manner.
• In the case of melodic patterning, either medial or final position may be
  specified, but not initial position. Recorded segmental specifications include
  medial glides [j] or [w], medial glottal or uvular fricatives or [l], and final
  coronal, velar, fricative or nasal.
• Vowel melodies include <low–high> (but not the reverse), diphthongal
  specification (<Vi>, <Vu> or both) and final vowel specification (often [i]).
These tendencies are also reflected in the French data presented here, but
systematic ambient-language-based contrasts (with English, for example)
are also evident. In particular, very few CVC structures are found in French
templates compared to data from English, Dutch, and Estonian children (Elbers
and Ton 1985; Fikkert 1994; Vihman 1976; Vihman and Velleman 1989;
Vihman and Vihman 2011) or from bilingual English/French children
(Brulard and Carr 2001). While CVC sequences are common enough in
high-frequency words typically addressed to children in French (e.g., poule 'hen',
vache 'cow', robe 'dress', jambe 'leg'), the number of words of this type that
children attempt to say may be reduced due to resyllabification in continuous
oral speech (Adda-Decker, Boula de Mareüil, Adda, and Lemel 2005), as
shown in (1).
(1)
la vache [la /va] => la vache est au pré [la/va/e/to/pe]
'the cow'            'the cow is in the pasture'
This suggests that the rhythm of French and its strong tendency for a CV-CV-CV
syllabification (a tendency which leads, in particular, to fairly systematic
resyllabification of the final coda of a word and its attachment to the next word)
shapes the word forms produced by French-speaking children, who quite
consistently avoid producing codas. As will be shown below (Section 2),
French offers a typologically unique accentual, metrical, and prosodic structure,
although its segmental phonology is not particularly complex, despite its
marked vowel inventory (Carvalho, Nguyen, and Wauquier 2010). If typological
constraints guide the formation of initial word forms in French children, this
could be expected to be more evident on the prosodic and rhythmic than on the
segmental level. Accordingly, we hypothesize that CV-CV-CV syllabification
of the input will constrain the templates produced by French children.
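The resyllabification of a word-final coda onto the following word can be illustrated with a short sketch. This is a simplified illustration only: the function name `syllabify`, the toy vowel set, and the naive rule (every consonant standing before a vowel becomes part of that vowel's onset) are assumptions made for the example, and the transcriptions are ordinary IPA renderings supplied for the purpose. Real French syllabification is subject to phonotactic restrictions that the sketch ignores.

```python
# Toy vowel set for the examples below (an assumption, not a full inventory).
VOWELS = set("aeiouyɛɔøœə")


def syllabify(phonemes):
    """Group a flat list of phonemes into syllables.

    Naive rule (an assumption of this sketch): all consonants that precede
    a vowel form the onset of that vowel's syllable; only consonants at the
    very end of the string are left in a coda.
    """
    syllables, onset = [], []
    for p in phonemes:
        if p in VOWELS:
            syllables.append(onset + [p])
            onset = []
        else:
            onset.append(p)
    if onset:  # trailing consonants: coda of the last syllable, if any
        if syllables:
            syllables[-1].extend(onset)
        else:
            syllables.append(onset)
    return ["".join(s) for s in syllables]


# In isolation the final consonant of "vache" stays in the coda; inside the
# longer phrase it becomes the onset of the next syllable.
print(syllabify(list("lavaʃ")))        # ['la', 'vaʃ']
print(syllabify(list("lavaʃetopʁe")))  # ['la', 'va', 'ʃe', 'to', 'pʁe']
```

Under this rule the longer phrase surfaces as a string of open syllables, which is the property of the input that the hypothesis above relies on.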
This brings us to another aspect of French that requires attention in the
typological definition of templates. This is the fact that in French, unlike
English, for example, common nouns seldom occur without a determiner
(Veneziano and Sinclair 2000; Bassano, Maillochon, and Mottet 2008).
Consequently, children are exposed to an input in which bare nouns are rarely
heard, so we can expect children’s prosodic templates to incorporate unanalyzed
pro-clitic determiners or to show some trace of those determiners.
1.3 Template functionality
The third point that requires examination concerns the purpose and function-
ality of templates in the word production of French-speaking children. The
answer to this question lies in our approach to the role of the template at the
interface between perception and production.
Experiments with infants show that at around 9–11 months, an age generally
corresponding to the transition from canonical to variegated babbling, children
can recognize the major prosodic boundaries (Hirsh-Pasek et al. 1987; Jusczyk
1992; Gerken 1994) and the accentual patterns of their language (Jusczyk, Cutler,
and Redanz 1993), and have a holistic representation of lexical labels, which are
undoubtedly underspecified phonologically as well as morphosyntactically
(Hallé and Boysson-Bardies 1994). Many perception studies have shown, for
example, that children identify function words at an early stage and essentially
use them, in association with other salient information, to identify word
boundaries and segment the speech signal into blocks (see Echols and Marti 2004; Hallé,
Durand, and Boysson-Bardies 2008). Moreover, Hallé and Boysson-Bardies
(1994, 1996) conducted experiments to identify the age of familiar word form
recognition. Children aged 11 months were presented with phonotactically
matched lists of common and rare words in a headturn preference procedure.
The results suggest that at this age there is no analysis or phonological
decomposition of lexical units, which are stored either globally or underspecified,
particularly as regards the unaccented syllable (Vihman, Nakai, DePaolis, and
Hallé 2004). Finally, other studies have demonstrated children’s difficulty, even a
few months later, in word recognition and lexical processing tasks (for a detailed
review, see Fisher, Church, and Chambers 2004), particularly as concerns
distinguishing between newly learned minimal pairs, which requires attention to
phonological detail (Barton 1978; Stager and Werker 1997).
We also now know that infant speech segmentation is strongly constrained by
the rhythm of the target language (Ramus, Nespor, and Mehler 1999), and that
speech rhythm is one of the first linguistic properties that infants employ to
distinguish languages (Nazzi, Jusczyk, and Johnson 2000). On this basis, then,
the templates seem to be global unanalyzed forms that provide a formal shape
for constructions in the sense of units with a form–meaning link. They can be
seen as a way for children to deal with the temporal organization of speech in
production: they constitute, at the phonological level of development, a
temporary structural response to the metrical structure, the syllabic organization, the
rhythm and the stress/accent patterns of input speech. In that sense, the
templates reflect in production the units perceived in the input at a very young age
(before 8 months). Thus, we can consider the templates as functionally
emergent units whose format is typologically constrained by the input of the ambient
language.
1.4 Outline
We will draw on the three issues that we considered above (format, constraints
and functionality) to support the template conception that we propose for
French. We begin by presenting the main aspects of French prosody, partic-
ularly with a view to countering a common alternative conception that we
consider to be misguided, namely, that French is an iambic language, a mirror
image of English, in which central status is accorded to the binary foot
(Section 2.1). Contrary to this view, we will provide evidence that the accentual
and rhythmic structure of French predicts that the first unit of acquisition is a
flat prosodic template (Section 2.2). Having proposed a model for such a
prosodic template for French (Section 2.3), we will show how data for the
early period of word production, to which these premises apply, partially
confirm the proposed theoretical model (Section 3), while data obtained at
later stages clearly demonstrate the use of this prosodic template as well as
other templatic phenomena (spreading, planar segregation) which can be
seen as effective acquisition strategies for a syllabic CV-CV language like
French.
The data that we provide to illustrate this analysis come primarily from three
sources: (a) a corpus whose collection and analysis was financed by the ESRC
project ‘Psychological significance of production templates in phonological
and lexical advance: A cross-linguistic study’ (the PSPT Project), made up of
longitudinal data from six children aged 17–29 months, (b) longitudinal data
from the Claire corpus (Wauquier-Gravelines 2005), and (c) a corpus of elicited
and semi-elicited production from 38 3- to 5-year-old French children (Braud
1998, 2003).
2. Templates in French: the promise of the input
2.1 What French is not and what it cannot do
A good deal of acquisition research has been carried out within the framework
of prosodic phonology (Selkirk 1984; Nespor and Vogel 1986; Fikkert 1994), in
which a universal hierarchy is assumed to govern the organization of prosodic
constituents, regardless of language, such that all levels are obligatorily repre-
sented and nested according to binary logic.
(2) Phonological Phrase (PPh)
    Prosodic Word (PWd)
    Foot (Ft)
    Syllable (σ)
    Mora (μ)
This perspective leads to an empirically inadequate analysis of French,
however. Indeed, the initial assumption within this approach is that acquisition
is achieved in any language through the production of progressively more
complex units based on binary metrical feet (trochaic in English and other
Germanic languages). Thus, a strictly linear order of acquisition is assumed,
which should vary little from language to language. Moreover, this analysis
assumes that all languages have binary feet and that children necessarily go
through a stage that includes the calculation of a lexical stress (Fikkert 1994;
Hayes 1995; Demuth 2001; Demuth and Fee 1995; Rose 2000; Goad and
Buckley 2006; Goad and Prévost 2008).
Until Rose (2000), this hypothesis was maintained for Germanic languages
almost exclusively, leading to the formulation of the trochaic bias hypothesis
(Allen and Hawkins 1978, 1980). Gerken (1994), on English, and Fikkert
(1994), on Dutch, both assume that children focus on strong–weak metrical
structure in constructing their first lexical units and perform truncation
operations accordingly, while maintaining that the binary foot may be parametrized
across languages. Since by no means all languages follow the pattern of lexical
stress languages using a trochaic meter, this assumption was subsequently
treated within the Principles and Parameters framework, which allows for
(binary) alternative paths into language. Hayes (1995) proposes that all children
use a binary foot, the head of which is parametrized according to contact input
(that is, initial patterns in any target language will be bisyllabic, either trochaic
strong–weak or iambic weak–strong), and that the default setting for this
parameter is a trochaic foot. Finally, Rose (2000), Dos Santos (2007), Goad and
Buckley (2006), and Demuth and Tremblay (2008) have proposed that the
acquisition of French is a mirror image of the process in English, and that
children go through an iambic foot stage, centering on the last two syllables of
the units they produce according to the weak–strong pattern.
These proposals appear to implicitly consider French as a language with
word-final lexical stress. Very little consideration is given to the prosodic
and metrical structure of French, despite the availability of good
descriptive accounts (Fónagy 1980; Verluyten 1982; Dell 1985; Di Cristo
1999). These accounts clearly show that French is not an iambic language
with lexical stress, for one good reason: unlike English, French uses the
phrase rather than the word as its accentual unit (Dell 1985; Fónagy 1980;
Di Cristo 1999).
2.1.1 Stress in French Functional categories in French do not carry stress,
while stress in lexical categories (nouns, verbs, adjectives, adverbs) always falls on
the last full syllable, but only when they are produced in isolation:
(3) Monosyllabic words: lait ‘milk’, cœur ‘heart’, vache ‘cow’
(4) Bisyllabic words: cheval ‘horse’, maison ‘house’, voiture ‘car’
(5) Tri/quadrisyllabic words: crocodile ‘crocodile’, éléphant ‘elephant’,
    balançoire ‘swing’, hippopotame ‘hippopotamus’
In continuous speech, stress placement varies according to the position of the
carrying word within a larger constituent, a syntactic, semantic or phonological
phrase (usually considered as a breath group):
(6) Marie aime son cheval ‘Mary loves her horse’
(7) Marie aime son cheval fou ‘Mary loves her crazy horse’
(8) Marie aime son cheval fou et orgueilleux ‘Mary loves her crazy and arrogant
    horse’
(9) Marie et son cheval / traversent la forêt au galop ‘Mary and her horse gallop across
    the forest’
These examples demonstrate that stress placement in French is determined not
on the level of the accented word itself, but higher up, on the level of the larger
phrase or utterance that the word is part of. The term ‘iambic language’ is
therefore inappropriate for French, given that a stressed constituent-final
syllable does not necessarily correspond to the heavy syllable of an iamb,
which is a metrical unit. While cheval (6) can be considered an iamb, val fou
(7) cannot.
2.1.2 The foot and the syllable in French metrics The hypothesis that
French is an iambic language is clearly invalidated by metrical structure. In
classical meter, the iamb is defined as a metrical grouping of two syllables, the
first of which is weak and the second strong. This term has been extended to all
forms of metrical structure that contain two syllables in the wS pattern
(weak–strong). Thus, iambic verse is composed of a number of iambs, that is, binary
feet whose second syllable is stressed or lengthened. A case in point is the
famous iambic pentameter, the meter frequently associated with the poetry of
English and German, both lexical-stress languages. In French, which is
considered a syllable-timed language, the verse is divided not into feet,
grouping long and short or stressed and unstressed syllables, but into syllables
of the same metrical value (Verluyten 1989). Verlaine’s poem ‘Green’ (10),
for example, is written in the classical form of alexandrine verses of twelve
syllables, all having the same metrical value. A caesura at the hemistich
boundary after the sixth syllable (the half point of the verse, producing a 6/6
structure) creates a division into two sets of six syllables with identical
metrical structure. In this case, the nucleus of the sixth syllable (i.e., fleur
[flœʁ], cœur [kœʁ], pas [pa], beaux [bo]) provides a phrasal accent that
announces the caesura. However, the 6/6 caesura is not essential to the
understanding of verses that can be segmented into another structure. One could
recite these verses without a break, producing a single twelve-syllable
constituent, or a single phonological phrase. This is evident in the third line,
which allows for liaison or linking of [z] across the caesura.
(10) Voici des fruits, des fleurs, des feuilles et des branches
     Et puis voici mon cœur qui ne bat que pour vous
     Ne le déchirez pas avec vos deux mains blanches
     Et qu’à vos yeux si beaux l’humble présent soit doux¹
     (Verlaine, ‘Green’, Romances sans paroles, 1874)
     [vwa/si/de/fʁɥi/de/flœʁ // de/fœ/jə/ze/de/bʁɑ̃ʃ
     e/pɥi/vwa/si/mɔ̃/kœʁ // ki/nə/ba/kə/puʁ/vu
     nə/lə/de/ʃi/ʁe/pa // [z]a/vɛk/vo/dø/mɛ̃/blɑ̃ʃ
     e/ka/vo/zjø/si/bo // lœ̃/blə/pʁe/zɑ̃/swa/du]
It has been argued that there is no organizational hierarchy or intermediary
structure between the syllables and the large prosodic constituents (the six-
syllable hemistich or even the twelve-syllable verse) and that the hemistich or
the alexandrine are flat structures, not hierarchical constructions based on a
binary-branching prosodic structure (Verluyten 1989). If the foot is taken to be a
necessary unit, the alexandrine could be analyzed as having twelve feet, but
these would be single feet, that is, twelve syllables.
In short, the fact that French stress placement is phrase-final rather than
word-final, as shown above, supports the claim that there is no intermediary structure
between the syllable level and the phrase. In most cases French employs unitary
feet² (i.e., syllables of equal value in a non-hierarchical constituent on the foot
level: Verluyten 1982, 1989; Dell 1985), suggesting that these are attached to
larger constituents in a flat structure.
But however questionable the postulation of an iambic foot as a metrical unit in
French may be, French does have, as we saw in (6)–(9), a final prominence that has
been well documented (Fónagy 1980; Dell 1985; Di Cristo 1999; Jun and
Fougeron 2000). Should this final prominence be interpreted as the instantiation
of a (w)S final foot (an iambic foot)? Some French linguists have indeed interpreted
it as reflecting a stressed syllable that would be the strong position of a (w)S foot
(Charette 1991), but a good deal of evidence supports an interpretation of the
French final prominence as being a domain edge marker belonging to the
intonation system in a language without feet (Verluyten 1982; Jun and Fougeron 2000).
Despite this debate, the position that French has an iambic foot is generally
taken for granted in acquisition studies, often without further discussion (but see
Goad and Buckley 2006, Goad and Prévost 2008, Goad 2011). The insistence
on (i) analyzing French as having iambic structures and (ii) considering that
French children will consequently systematically produce iambic feet at the
early stages (e.g., Demuth and Tremblay 2008) appears to be based on a
systematic theoretical bias in favor of the universality of the prosodic hierarchy,
although the empirical facts of French must be taken to present important
challenges to the theory.
2.2 What French is and what it can do
It has also been proposed that French metrical structure is based not only on
phrase-final stress but also on a phrase-initial counter-stress, symmetrical to the
final stress. The hypothesized phrase-initial stress was originally proposed by
Fónagy (1980), and later adopted and developed primarily by Di Cristo (1999),
who provides the following description: ‘[French exhibits] a tendency to
accentuate the first syllable of words, which gives rise to the formation of
barytone patterns and accentual arcs in which only the initial and final syllables
of a phrase are stressed’ (my translation).³
According to Di Cristo, the existence of this initial stress in contemporary
French is accepted by most prosodists, however they conceptualize the
phenomenon. Differences relate to the exact interpretation of the counter-stress
(variously regarded as emphatic, an echo or secondary stress). For Di Cristo,
both initial and final syllables are, therefore, prosodically strong positions,
forming the two pillars of an accentual arc within which the metrically
equivalent internal syllables are inserted and eventually reduced relative to the edges
of the constituent. Examples (11)–(14) are extensions of (6)–(9) with the
addition of Di Cristo’s proposed counter-stress.
(11) Marie aime son cheval ‘Mary loves her horse’
(12) Marie aime son cheval fou ‘Mary loves her crazy horse’
(13) Marie aime son cheval fou et orgueilleux ‘Mary loves her crazy and arrogant horse’
(14) Marie et son cheval / traversent la forêt au galop ‘Mary and her horse gallop across
     the forest’
2.3 Templates in French: what is the appropriate model for acquisition?
Based on the accentual arc model we may conclude, following Macken (1995),
that the units available to children at the production/perception interface can be
schematized as in (15):
(15) [σ (σ)ⁿ σ]
This formal structure is initially defined by the constraints that produce the
accentual arc structure and that are heard in the French input. It is bounded by
two demarcating stresses that correspond to the stress and counter-stress
described above: the last syllable (σ) carries the demarcating phrasal stress
that delimits the right edge of the unit, while the first syllable (σ) bears the
counter-stress. The initial and final syllables define the boundaries of the
accentual arc, and thus serve as prosodically strong positions. This structure
can be expected to provide the first lexical pattern for phonological
development in French. It derives from a prosodic unit that is perceptually
available, bounded by stress and counter-stress, and therefore segmentable in
the input. Therefore, we expect that it is this abstract phonological structure
that is targeted by the production templates that we observe and see as
temporary structural responses to the prosodic characteristics of the input.
From this we can develop the following predictions: (i) children will first
construct the strong syllables, (ii) these will never be truncated, and (iii) they will
undergo little deformation. Between these prosodically strong boundaries there
is an open number (n) of intermediary syllables, where n can theoretically
take any value from 0 to infinity. We need to consider that n may be 0: the
internal syllable is generally optional at the early stages (see Section 1.2). Braud
(2003) shows that n is consistently 2 in early production, up to the age of 2.
Unlike the first and second syllables, the n-site should be less stable and more
variable. We also postulate that this prosodic structure will constrain the
templates produced at an early stage and will later be the domain of
morphophonological generalizations.
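One way to make these predictions concrete is to check a child form against its target for preservation of the two edge syllables and for the number of internal syllables retained. The sketch below is only one possible reading of the template in (15), under stated assumptions: the function names, the `max_internal` parameter, and the example forms are invented for illustration and are not drawn from the corpora; forms are given as lists of syllables, and the check requires at least two syllables.

```python
def split_template(syllables):
    """Split a form into (initial edge, internal syllables, final edge)."""
    if len(syllables) < 2:
        raise ValueError("this sketch needs at least two syllables")
    return syllables[0], syllables[1:-1], syllables[-1]


def respects_predictions(target, child, max_internal):
    """Check a child form against its target under our reading of (15).

    The two edge syllables must be identical to those of the target, and
    the child form may keep at most `max_internal` internal syllables.
    """
    t_first, _, t_last = split_template(target)
    c_first, c_internal, c_last = split_template(child)
    edges_kept = c_first == t_first and c_last == t_last
    return edges_kept and len(c_internal) <= max_internal


# Invented four-syllable target and two invented child forms.
target = ["pa", "ti", "ko", "la"]
print(respects_predictions(target, ["pa", "ko", "la"], max_internal=1))  # True
print(respects_predictions(target, ["ti", "ko", "la"], max_internal=1))  # False
```

The second form fails because the initial edge is lost, which is the kind of truncation that prediction (ii) rules out. Requiring identical edge syllables is stricter than prediction (iii), which allows for some deformation.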
3. Data and observed template formats
We turn now to a comparison of this schematization of the rhythmic template for
French with the evidence provided by three sets of French child data. We focus
on the question as to whether or not French children’s early word forms exhibit
the kind of systematic patterning that would reflect the typological constraints
given by the rhythmic structure of French as we have described it.
3.1 The PSPT project data: six children at the early stage
Case studies have been prominent in the literature on child phonology, but they
do not provide the best way to test claims about the typological systematicity of
data; it is preferable to compare data from several children of the same age. We
rely here on data obtained and analyzed as part of an ESRC project,
‘Psychological Significance of Production Templates in Phonological and
Lexical Advance: A cross-linguistic study’ (the PSPT project), which includes
longitudinal data from 6 French children aged 17–29 months. These children (2
girls and 4 boys) were recorded once a month in 30-minute sessions of
naturalistic, non-elicited interaction with a parent. Recordings began at the 25-word
point⁴ (based on a parental questionnaire and researcher verification in an initial
control recording). The data were transcribed and analyzed using PHON (Rose
and MacWhinney, 2013). Here we present our findings based on the first five
recording sessions, which will enable us to observe the prosodic structure of the
first words.
3.1.1 The idiosyncratic character of the first templates We begin by noting
that the data confirm the tendencies identified by Vihman (2010) (see
Section 1.2), at least in part. A size constraint on these word forms is clearly
evident (they never exceed two vocalic nuclei). These data also confirm the
great variability of word forms from one child to the next and the existence of
idiosyncratic strategies that suggest individual articulatory continuity with
babbling and the personal preferences of each child.
To illustrate, Table 11.1 presents the 28 occurrences of the word micro
‘microphone’ produced by Béryl during a single recording session (19 mos.).
The data reveal Béryl’s strong preference for the form [ao] (13 out of 28
tokens). But beyond that, all tokens realize the VCV pattern, with a vocalic
a-o melody and a fixed medial consonant consisting of a uvular fricative (14 out
of 28 tokens). The remaining patterns are roughly modeled on the main pattern,
with two systematic changes: either to the vocalic melody (o-o or a -o alternate
with a-o), or to the medial consonant (the uvular fricative alternates or (in one
case) combines with the velar stop [k]). Table 11.2 shows that this pattern was
also extended to other target words that were selected (in gray, here and
elsewhere) or adapted⁵ to fit this pattern (same session).
At first glance, this result seems to follow an idiosyncratic articulatory logic in
the construction of a template, to the extent that the chosen structure is clearly
specific to this child whereas uvular fricatives in medial position are not particularly
characteristic of French. Thus, the medial /kr/ cluster of micro, with its uvular [r],
must be supposed to have inspired Béryl’s rough phonetic approximation.
Table 11.1. Béryl’s 28 tokens of micro ‘microphone’ (at 19 months)

Main patterns   N tokens   Other patterns (one token each)
[ao]            13         [akpo]
[oo]            2          [pako]
[o]             2          [ako]
[oo]            2          [koko]
[tao]           1          [ahko]
[oko]           1          [akʰo]
                           [o]

Table 11.2. Béryl’s other word forms reflecting the <aCo> template

Target words                    Béryl’s productions
agneau [aɲo] ‘lamb’             [alo]
bateau [bato] ‘ship’            [ato]
poisson [pwasɔ̃] ‘fish’          [ao]
éléphant [elefɑ̃] ‘elephant’     [afo]
crapaud [kʁapo] ‘toad’          [ako]

3.1.2 Typological characteristics of the first templates Yet Béryl’s
template is perhaps less idiosyncratic than it appears, if analyzed on the
syllabic level and compared to Vincent’s data (at 17 mos.), for example
(Table 11.3).
In fact, Vincent’s data also reveal a VCV structure, where the first vowel is
frequently central (schwa or a). This suggests that Béryl’s apparently idiosyncratic
VCV template corresponds to a prosodic structure available to other French
children. This hypothesis is confirmed by the data summarized in Figure 11.1.
Figure 11.1 shows the percentages of word forms produced by the six
children over the course of their first five recording sessions, sorted according
to the output syllabic structure. The first structures produced are mainly CV for
all six children (the most frequent production form: 43 percent on average). The
CV syllable may derive fully or partially from the final syllable of the target
word (e.g., [k] < [ɑ̃kɔʁ], encore ‘again’), less frequently from the initial
syllable ([k] < [kanaʁ], canard ‘duck’), or from segmental reorganization
based on both the initial and the final syllable of the target word ([bu] < [bizu],
bisou ‘kiss’). This CV syllable is not always the one receiving final stress in the
adult target (and can therefore still less be characterized as the ‘strong syllable’
of an iambic foot).
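Sorting word forms by output syllabic structure amounts to mapping each transcribed form to a CV skeleton and tallying the skeletons. The following sketch shows one way this could be done. It is not the procedure used in the PSPT project (the data there were analyzed in PHON); the function names and the toy vowel set are assumptions, each character is treated as one segment, and the five sample forms are used only to exercise the code, so the resulting shares do not correspond to the percentages reported in Figure 11.1.

```python
from collections import Counter

# Toy vowel set (an assumption); every other symbol counts as a consonant.
VOWELS = set("aeiouyɛɔøœəɑ")


def cv_skeleton(form):
    """Map a transcribed form to its CV skeleton, one symbol per segment."""
    return "".join("V" if seg in VOWELS else "C" for seg in form)


def structure_shares(forms):
    """Return the percentage of forms realizing each CV skeleton."""
    counts = Counter(cv_skeleton(f) for f in forms)
    total = sum(counts.values())
    return {shape: 100 * n / total for shape, n in counts.items()}


# A handful of child forms, for illustration only.
forms = ["alo", "oke", "avo", "ala", "bu"]
print([cv_skeleton(f) for f in forms])  # ['VCV', 'VCV', 'VCV', 'VCV', 'CV']
print(structure_shares(forms))          # {'VCV': 80.0, 'CV': 20.0}
```

A fuller implementation would need to treat nasal vowels, diphthongs, and diacritics as single segments rather than relying on one character per segment.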
However, we also note that CVCV and VCV structures are systematically
produced at an early stage alongside the CV pattern (unlike the remaining
structures, none of which accounts for more than 5 percent of the tokens
produced). The CVCV structure is primarily produced in the case of targets
that are reduplicated in the adult language (doudou ‘security blanket’, papa
‘daddy’, maman ‘mommy’), but for non-reduplicated targets (lapin ‘rabbit’,
chapeau ‘hat’) the structure of the child form is more likely to be VCV. The
VCV structure is obtained either through selection (allo ‘hello’, attends ‘wait’)
or adaptation (CVCV > VCV: lapin, chapeau, fermé ‘closed’). The tendency
to favor some words over others can also be observed longitudinally
(Figure 11.2). The CV structure arises early as the preferred word form and
remains the most used structure throughout the five sessions for all the
children. Despite individual variation from one child to the next (Figure 11.3),
the preferred structures include CV, CVCV, and VCV for all but one child.
The children can be ranked from Bastien, who primarily used the CV structure
and secondarily the CVCV and VCV structures, to Béryl, who made equal
use of the CV, CVCV, and VCV structures. In all cases the CV structure is
used early and remains the most used by all the children from the first to the fifth
session.

Table 11.3. Vincent’s word forms, the <VCV> template

Target words                                    Vincent’s production
allo [alo] ‘hello’ (on telephone)               [alo]
OK [oke]                                        [oke]
attends [atɑ̃] ‘wait’                            [at]
ici [isi] ‘here’                                [ii]
bravo [bʁavo] ‘bravo’                           [avo]
voilà [vwala] ‘there it is, there you are’      [ala]
avion [avjɔ̃] ‘airplane’                         [aj]
fermé [fɛʁme] ‘closed’                          [ame]
caché [kaʃe] ‘hidden’                           [ae]
encore [ɑ̃kɔʁ] ‘again’                           [at]

Figure 11.1. Percentages of early syllable structures of word forms (averaged
over five sessions for six children). [Categories: CV, CVCV, VCV, CCV, V, CVC,
(C)VCCV, VC; scale: 0% to 50%.]
The generalization that emerges is that the first word structure to stabilize
and be frequently produced by French-speaking children is built around a
CV syllable and not a binary iambic foot. This is followed by two options that
are frequently produced, in parallel, to augment and vary the first CV template:
CV > VCV, where V₁ is mostly a central or front vowel ([a] [] [e] [])
CV > CVCV, where CV is frequently reduplicated. There are also
some cases where the two consonants differ but are harmonized
(mainly for place).
Both of these patterns express a systematic avoidance of consonant change across
the word and a preference for open syllables, word-internally as well as finally.
This means that Béryl’s [ao] word form, which at first looks so idiosyncratic, can
be analyzed as the realization of a more general <aCo> template which is itself an
instantiation of a <VCV> pattern reflecting one of the main typological
characteristics of French (i.e., open syllabification). And, as we saw,
the <VCV> pattern can itself be analyzed as a variant of the CV pattern (V + CV),
which is by far the most common structure produced by French children.

Figure 11.2. Number of lexical items belonging to each word form: CV,
CVCV, VCV, V (averaged over five sessions for six children). [Horizontal axis:
Sessions S1 to S5; vertical axis: Number of occurrences (6 children), 0 to 1200.]
3.2 Later longitudinal data: truncation and reduplication
Will the typological constraints of the target language have the same effect on
the word forms of older children? We present below data from the longitudinal
Claire corpus, from sessions taken at the age of 22–25 months, when she had a
vocabulary level higher than what is reflected at the 25-word point (cumulative
lexicon of some 200 words): see Tables 11.4 and 11.5, organized by length of
the adult target. We focus on Claire’s truncations and reduplications and the way
they may fit a prosodic template partly (or fully) expressing the typological
constraints described above (Section 1).
3.2.1 Truncation and templates As illustrated in Table 11.4, one- and
two-syllable words are produced without truncation, mostly with a schwa or [l],
[la], [], which can be interpreted as proto-determiners (Veneziano and Sinclair
2000, see Section 1.2). Note that the proto-determiner does not appear with
proper nouns in the input and is not reflected in Claire’s forms of these either. In
contrast, three- and four-syllable words exhibit partial deletion of segmental
material between the proto-determiner and the last syllable (or even the last
vowel). Thus, Claire preserves the two edges of the target words according to
the proposed template. This is true for long words, where the edges are preserved
at the expense of the internal syllables, as well as for monosyllabic words,
which are produced with an initial vowel as a proto-determiner
(Veneziano and Sinclair 2000).
Figure 11.3. Individual variation in the syllable structures of word forms
(averaged over five sessions). [Children: Bastien, Julien, Romuald, Vincent,
Marie, Béryl; structures: CV, CVC, CCV, CVCV, VCV, V; vertical axis: % of
production, 0 to 70.]
3.2.2 Truncation, reduplication, and spreading In the same period Claire
developed another strategy for handling three- and four-syllable words, which
provides additional confirmation for this analysis (Table 11.5).
Here Claire lengthens her word forms by reduplicating a syllable of the
truncated word. She appears to proceed in two steps:
1. Truncate left edge of word but preserve determiner;
2. Lengthen the word by reduplicating the left-most syllable.
For example, for le chocolat [ləʃokola] Claire begins by producing [ekola],
followed by the string [ekokola], produced by reduplication of the syllable [ko];
for un crocodile [œ̃kʁokodil] she begins by producing [koti], then omits the
beginning of the word to obtain the number of syllables of the target [ekokodi]
by reduplicating the penultimate syllable [ko]. Reduplication can be seen here
as motivated by the presence of very similar syllables in the adult target word
([kʁo] and [ko] in [œ̃kʁokodil]).
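Claire’s two steps can be written out as a pair of small functions operating on lists of syllables. This is a hedged sketch rather than a model of her grammar: the function names, the `keep` parameter, and the segmentation of the target into syllables are assumptions introduced for the illustration. Only the two output forms for le chocolat, [ekola] and [ekokola], come from the data discussed above.

```python
def truncate_left(determiner, word_syllables, keep):
    """Step 1: drop syllables from the left edge of the word while
    preserving the (proto-)determiner in initial position."""
    if keep < 1 or keep > len(word_syllables):
        raise ValueError("keep must be between 1 and the word length")
    return [determiner] + word_syllables[-keep:]


def reduplicate_leftmost(form):
    """Step 2: lengthen the form by copying the left-most syllable of the
    truncated word, i.e. the syllable right after the determiner."""
    if len(form) < 2:
        raise ValueError("form needs a determiner and at least one syllable")
    determiner, word = form[0], form[1:]
    return [determiner, word[0]] + word


# le chocolat: proto-determiner [e], word syllables as assumed here.
step1 = truncate_left("e", ["ʃo", "ko", "la"], keep=2)
step2 = reduplicate_leftmost(step1)
print("".join(step1))  # ekola
print("".join(step2))  # ekokola
```

The reduplicated form has the same number of syllables as the adult target with its determiner, while both edges of the form stay in place.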
In fact, Claire’s production aims at the template proposed in (15) and
elaborated on the basis of Di Cristo’s concept of the accentual arc. This indicates
that in target utterances longer than two vocalic nuclei (based on adult targets
longer than CVCV), she preserves the initial vowel (the proto-determiner) but
reduces the beginning of the content word and shapes a prosodic template with
an accented final syllable [σ], a counter-stressed first syllable [σ] and internal
syllables with n < 2.

Table 11.4. Claire’s word forms (22–23 mos.)

Monosyllabic target words                        Child forms
la vache [lavaʃ] ‘the cow’                       [ja] / [laja]
l’âne [lan] ‘the donkey’                         [ltan]⁶
le pot [ləpo] ‘the pot’                          [lpo]
le chien [ləʃjɛ̃] ‘the dog’                       [lt]
Claire [klɛʁ]                                    [l]

Bisyllabic target words                          Child forms
le bébé [ləbebe] ‘the baby’                      [lbebe]
un ballon [ /dbal ] ‘one balloon’                [abal]
deux ballons [døbalɔ̃] ‘two balloons’             [dbal]
Didou [didu]                                     [didu] [tidu]
Maman [mamɑ̃] ‘Mummy’                             [mam]

Trisyllabic target words                         Child forms
éléphant [elefɑ̃] ‘elephant’                      [e]
un éléphant [œ̃nelefɑ̃] ‘one elephant’             [ej]
deux éléphants [døzelefɑ̃] ‘two elephants’        [de]
le hérisson [ləerisɔ̃] ‘the hedgehog’             [lij]
Aurélien [oreljɛ̃]                                [j]
Olivier [olivje]                                 [oje]

Quadrisyllabic target words                      Child forms
la brosse à dents [labʁɔsadɑ̃] ‘the toothbrush’   [anad]
un médicament [œ̃medikamɑ̃] ‘a medicine’           [apam]
In order to evaluate this interpretation we now turn to a larger data set from
children at a still more advanced lexical level (age 2.5–5 years). If the French
children are aiming at a prosodic shape that can be formalized as in (15), we
should be able to observe the same patterns as their utterances become longer.
3.3 Multisyllabic words and truncations: how do the templates evolve?
We present here data extracted from a comparative corpus of 18 children aged
3036 months (Braud 1998), followed by data from three groups of 20 children
aged 35 years, which was the basis for a systematic study of truncation and
reduplication in French (Braud 2003). To create this database groups of children
ranging in age from 2.5 to 5 years were recorded. Speech was elicited using a
picture-naming task, with word length as the experimental variable. The experi-
ment was presented as a game. Children were presented with a picture and asked:
Quest-ce que tu vois sur cette image? What do you see in this picture? The
following examples were extracted fromthe data of children aged 2.5 and 3.5 years.
In Table 11.6, as in Claire's data, reduplication and truncation are carried out
simultaneously and almost exclusively on words of more than two syllables.
As regards monosyllabic words, the few observed cases of reduplication
appear in words that are already lexicalized as reduplicated forms in the adult
input and used as terms of endearment in colloquial French. The children
apparently do not reduplicate monosyllabic words spontaneously. In fact,
Plénat (1984, 1999) has shown that reduplication is systematically used in
French to form diminutives and nicknames with a hypocoristic value (e.g.,
Guiguite for Marguerite, Roro or Bébert for Robert). This is particularly evident
in the case of nounours and nonos 'bear/teddy bear', which may be heard as such
in the input. Consequently, reduplication of short words, which is already
provided in the input, can presumably not be considered a productive process
in French children's early word forms.
Table 11.5. Claire's reduplication patterns based on tri- and quadrisyllabic targets
Tri- and quadrisyllabic target words      Child forms
le chocolat [lokola] 'the chocolate'      [ekola]
le chocolat [lokola] 'the chocolate'      [ekokola]
un crocodile [kokodil] 'a crocodile'      [koti]
un crocodile [kokodil] 'a crocodile'      [kukudi]
un crocodile [kokodil] 'a crocodile'      [ekokodi]
If we now consider three- and four-syllable words, we find the same phenomena
noted in Claire's output: omission of syllables is not random but
conditioned by the prosodic structure of the input. The first and second syllables
of the word are quasi-systematically omitted, while the phrase-initial syllable
(the proto-determiner) and the final syllable, i.e., the two edges of Di Cristo's
accentual arc, are consistently preserved.
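The edge-preserving truncation just described can be sketched in a few lines of Python. This is only an idealized illustration, not the authors' procedure: the function name `truncate_to_edges`, the `internal` parameter (standing in for the n index of the template) and the orthographic syllabification of the example are all assumptions introduced here, and the sketch ignores the segmental changes (vowel reduction of the proto-determiner, consonant substitution, reduplication) visible in the children's actual forms.

```python
def truncate_to_edges(syllables, internal=0):
    """Keep the phrase-initial syllable and the final syllable, plus at most
    `internal` syllables immediately preceding the final one.

    `syllables` is the whole target phrase (determiner + content word),
    already split into syllables. Hypothetical helper, for illustration only.
    """
    if internal < 0:
        raise ValueError("internal must be non-negative")
    # Phrases that already fit the template are left untouched.
    if len(syllables) <= internal + 2:
        return list(syllables)
    # Left edge + (up to `internal` medial syllables) + right edge.
    return [syllables[0]] + list(syllables[len(syllables) - 1 - internal:])


# Orthographic syllables of "un aspirateur", used purely as placeholders.
utterance = ["un", "as", "pi", "ra", "teur"]
print(truncate_to_edges(utterance, internal=0))
print(truncate_to_edges(utterance, internal=1))
```

With `internal=0` only the two edges of the phrase survive; raising `internal` lets medial syllables back in from the right, which is one simple way to picture the gradual lengthening of the children's forms.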
4. Discussion
Let us now examine how the three sets of data shed light both on the hypotheses
discussed above and on the predictions we made concerning the way in which
the typological constraints of French (the strong tendency for CV-CV
syllabification and the early initial filler on content words) could be expected
to shape the word forms produced by French-speaking children (see Section 1.2).
We will also discuss the relevance of the templatic model we proposed in
Section 2.3 as a challenge to the iambic-foot approach. We argue that there is
neither a need nor a justification for postulating an iambic foot, and that a
rhythmic template can better cover our three data sets and account for the
evolution of the word forms throughout the course of development.
4.1 CV-CV syllabification and early word forms
Many studies comparing the acquisition of Romance vs. Germanic languages
(Fikkert et al. 2004), as well as Vihman's (2010) data, show that French children
do not tend to arrive at early CVC template patterns, while Dutch, English,
German, and Estonian children often do. Brulard and Carr (2001) also
demonstrate a CVC pattern in their English/French bilingual child. Thus, omission
of codas is not dictated by an age-related or wholly maturational constraint. Nor
does it originate from isolated words in the French input, which provide many
CVC words of high frequency like robe 'dress', soupe 'soup', dame 'lady', coq
'rooster', vache 'cow', which commonly occur in child-directed speech. Why
then do monolingual French children seem to filter the input to avoid CVC
patterns? We assumed that their templatic patterns are influenced by the rhythm
and the CV-CV syllabification of French (Section 1.2). More generally, one can
assume that the parameters of variation in the children's surface forms in
production must be at least partially typologically constrained and limited
to the underlying structures supported by the rhythmic and syllabic structure of
the target language.
Table 11.6. Truncation and reduplication patterns in older children
Monosyllabic target words                                    Child forms
un ours [nus] 'a bear', un nounours [nunus] 'teddy bear'     [nunus] / [nunus]
un os [ns] 'a bone'                                          [nons]
Trisyllabic target words                                     Child forms
un arrosoir [naozwa] 'a watering-pot'                        [ozwa] / [ezwa]
une coccinelle [ynkksinl] 'a ladybird'                       [ynkokosinl]
Quadrisyllabic target words                                  Child forms
un accordéon [nakde] 'an accordion'                          [aaj]
un épouvantail [nepuvtaj] 'a scarecrow'                      [pupuvtaj]
un hélicoptère [nelikpt] 'a helicopter'                      [nenikt] / [ninikpt]
un aspirateur [naspiat] 'a vacuum cleaner'                   [pasat] / [aat] / [astat] / [piat] / [piat] / [pisat]
The strong generalizations that emerge from our data clearly confirm this
assumption. As we have seen, French-speaking children share a preference for
open-syllable structures, early vowel stability in the nuclei and an avoidance of
consonant clusters. The first word forms to be produced systematically by
French-speaking children are built around the CV word structure.
The children's word forms then evolve in two parallel ways, which augment
and vary the first CV template:
CV > VCV, where V1 is mostly a central or front vowel ([a] [] [e] [])
CV > CVCV, where CV is frequently reduplicated. There are also
some cases where the two consonants differ but are harmonized
(mainly for place).
These regularities correspond to our predictions regarding the typological
constraints that French input imposes on the first templates. The rhythmic
features of French, a syllabic language that favors CV-CV syllabification
(sometimes even at the expense of word boundaries, as in the case of liaison
sequences: see Section 2.1.2), lead children to construct their initial templates
on the basis of the CV syllable and to prefer open structures.
4.2 Status of the initial filler: does it support the binary foot?
Another aspect of the data is interesting, although harder to interpret: the
presence of low or mid vowels as V1 in VCV structures. For certain
words like [ap] or [apo] (lapin, chapeau) one is inclined to analyze V1 as
the first-syllable vowel of the corresponding target word (see also Veneziano
and Sinclair 2000). This leads one to analyze the adaptation as CVCV > VCV
and to suppose that the child has omitted the onset consonant of the target. In
other cases ([ao], [oo], [o], [oko], [tao] for micro), another analysis is
possible, based on an operation that enables the generation of [a + o], [o + o],
[ + o], [o + ko], [ta + o], which can then be analyzed as [vowel + (velar
obstruent + o)], in which the vowel does not correspond to V1 in the target but
rather to the vowel of some (unspecified) determiner. Accordingly, micro is
reduced to the final syllable [o] or [ko], and the first vowel of each form
becomes analyzable as the trace of a proto-determiner. In these cases, one has to
reconsider Béryl's adaptations as CVCV > CV > V + CV, and assume that she
completely truncated the initial syllable (micro is encoded as [o] or [ko]), and
then produced this syllable with an initial vowel that reflects the presence of a
determiner in the adult input. This interpretation is corroborated by the fact that
the mother's input for micro was always produced with a determiner (le micro
'the microphone', c'est le micro 'it's the microphone'). This would be true for
any French speaker, since content words are seldom produced without a
determiner in any context. Moreover, Bassano et al. (2008) have proposed
that the systematic presence of prenominal fillers in the early stages of French
L1 acquisition can be analyzed as representing determiners or
proto-determiners. Our data confirm the widespread presence of prosodic positions
filled by fillers that could represent the proto-determiners. These fillers can be
taken to reflect another typological characteristic of French adult input: the
systematic expression of determiners before common nouns.
The VCV word forms thus conform to the theoretical template in (15), which
includes the boundaries (the final syllable on the right edge of the template and the
proto-determiner as the nucleus of the initial syllable on the left edge, with
truncation of the first syllable of the target word) and a zero value for the n
index. Based on two case studies (Tim and Marie), Demuth and Tremblay (2008)
propose a different analysis for VCV structures. They argue that determiners
appear more quickly and more frequently with monosyllabic words (forming an
iambic foot due to prosodic constraints) than with longer words (bi-, tri- or
quadrisyllabic). This analysis is debatable, however. Indeed, as shown by
Vihman, children exhibit strict limits on the size of the early units produced,
whatever the target language, so that the size of the early units may be constrained
by more general psycholinguistic limitations (such as working memory). But
even if the early presence of CVCV and VCV sequences is taken to be the result of
a prosodic constraint, there is no indication that it must necessarily be due to the
use of a binary branching structure. It is equally possible that children access a flat
templatic structure whose prosodic length is restricted and which varies from
child to child. The fact that the first vowel in VCV structures can be interpreted
either as a full vowel of the internal syllable of the target word
(lapin > [ap]) or as a clitic proto-determiner (un micro > [ao]) supports this
view (see also Veneziano and Sinclair 2000). This means that the VCV structure
found in child production need not be interpreted as an iambic (w)S foot, which
can in any case account for it only in part. The alternative interpretation proposed
here accounts for the data in a more comprehensive way.
Finally, as Claire's output shows (like Marie's but unlike Tim's, at roughly the
same age: see Demuth and Tremblay 2008), vocalic fillers or initial vowels are
present even in bisyllabic and trisyllabic units. It appears, therefore, that much
variability can be observed from one child to the next, as Demuth and Tremblay
themselves note elsewhere. This fact could be attributed to individual differences
in accessing the prosodic hierarchy, as assumed by Demuth and Tremblay.
4.3 From early CV and VCV word forms to the accentual arc
At the later stages, when children reach a point in development that allows for more
than two vocalic nuclei, the limits of a prosodic template can be set to allow
development and to include more internal syllables. We can now consider the
following developmental scenario, progressing from CV and/or VCV patterns by
the adjunction of internal CV structures: the children use the same pattern to expand
the template with internal syllables, particularly through reduplication of the
penultimate syllable, along with gradual diversification of the segmental material. We
express this proposal formally in (16), which reflects the earlier stages (with no more
than one internal syllable, n = 1) and (17), which includes later stages as well (n > 1).
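(16) $\left.\begin{array}{l}\mathrm{CV} > \mathrm{VCV}\\ \mathrm{CV} > \mathrm{CVCV}\end{array}\right\} > \mathrm{V(CV)_{1}CV}$
(17) $\mathrm{CV} > \mathrm{VCV} > \mathrm{V(CV)_{1}CV} > \mathrm{V(CV)_{2}CV} > \mathrm{V(CV)_{3}CV}$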
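As a rough illustration of how the template in (16) and (17) constrains word shape, the following Python sketch reduces a transcribed form to a C/V skeleton and checks it against V(CV)nCV for a given upper bound on n. It rests on assumptions that are not part of the chapter: the function names are invented, the vowel inventory is a toy set suited only to plain-ASCII transcriptions, and every non-vowel character is treated as a consonant.

```python
import re

# Assumption: a toy vowel inventory for plain-ASCII transcriptions.
VOWELS = set("aeiouy")


def skeleton(form):
    """Map each segment of a transcribed form to 'V' or 'C'."""
    return "".join("V" if seg in VOWELS else "C" for seg in form)


def fits_template(form, max_internal):
    """Check whether a form matches V(CV)^n CV with 0 <= n <= max_internal.

    n counts the internal CV syllables between the initial vowel
    (the proto-determiner slot) and the final CV syllable.
    """
    pattern = rf"V(?:CV){{0,{max_internal}}}CV"
    return re.fullmatch(pattern, skeleton(form)) is not None


# Forms cited in the text, in plain-ASCII spelling.
for form in ["apo", "ekola", "ekokola"]:
    print(form, skeleton(form), [fits_template(form, n) for n in (0, 1, 2)])
```

On this reading a form such as [apo] already fits with n = 0, [ekola] requires n = 1, and the lengthened [ekokola] requires n = 2, which mirrors the stepwise progression in (17). The CV > CVCV path in (16) has no initial vowel and is deliberately left outside this pattern.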
Moreover, lengthening and segmental diversification do not appear to occur at the
same time. Children seem first to lengthen the initial structure through the
addition of syllabic positions that they fill with a reduced set of the consonants
and vowels to be found in the adult target; they then diversify at a later stage. For
example, un aspirateur 'a vacuum cleaner' ([naspiat]: see Table 11.6), a
quadrisyllabic word, is first produced with three syllables, reproducing a subset of
the vowels and consonants of the target ([astat]: [, a, ], on the one hand,
and [s, t, ], on the other). The word is then lengthened and segmentally
diversified ([pisat]: [, i, a, ] and [p, s, , t, ]) until the adult target
is obtained ([naspiat]: [, a, i, a, ] and [n, s, p, , t, ]). The child's lexical
representations may be regarded as prosodically conditioned templates: chocolat
is the result of the redistribution of two, then three consonants and two vowels [o],
[a], based on a progressively elaborated template with two basic strong positions,
marked by the final stress on the right edge and a counter-stress on the left edge.
The proposed sequence for the word chocolat is illustrated in (18)–(20).
(18) [ eko la]: V1CVCV
k l
V V C C V
e o a
(19) [ ekoko la]: lengthening
k l
C V C V C V C V]
e o a
(20) [ eoko la]: diversification
k l
C V C V C V C V
e o a
This representation assumes that in children's early production consonants
and vowels are stored on two separate tiers (planar segregation), as proposed
by Menn (1978) and Macken (1995), within the Autosegmental Phonology
framework (see note 7). This makes it possible to formally express the autosegmental
dimension of children's productions as regularly observed in phonological
development (consonant/vowel dissociation, reduplication, onset/nucleus
dissociation, spreading, harmonies).
5. Conclusion
This chapter has examined in more detail a hypothesis previously proposed for
French (Braud and Wauquier-Gravelines 2004; Wauquier-Gravelines 2005),
offering a formal, rhythmically determined template that subsumes a wide
variety of structures observable in child output, grounded in a rich empirical
base. The approach rests on the premise that children's first units are
conditioned by systematically expressed typological constraints in addition to
individual and idiosyncratic constraints. The idea is not to reduce the variety of all
observable facts to a single referential template that would serve as a unique
underlying representation for all child forms, but rather to propose a formal-
ization of a prosodic template that predicts the development of the prosodic
structure of French.
We have shown that the analysis of French prosody invites us to discard the
premise that the L1 phonological acquisition of French requires the postulation
of early iambic units. We have proposed an alternative theory, based on Di
Cristo's (1999) concept of the accentual arc, according to which French-speaking
children's early word forms reflect this structure through a template
that consists primarily of the two pillars of the arc, a pattern which later evolves
by the addition of syllabic structures in medial positions of the template.
The proposed developmental scenario assumes a flat, nonbranching structure
into which children gradually add CV units corresponding to the syllabification
pattern commonly present in French input speech, which favors open structures
even at the expense of word boundaries. It is argued that the basic unit employed
by French children is the CV syllable rather than the binary foot. Our three sets
of data trace a developmental course from early word forms (PSPT project) to
later ones (Claire and Braud corpora) and demonstrate that children begin with
open CV and VCV structures and deploy in parallel planar segregation between
consonantal and vocalic melodies to progress by the addition of internal CV
structures towards the adult target shape of the words. Finally, the proposed
template and developmental scenario allow us to account for the presence of
early determiners with nominal units in child output, reflecting another
typological characteristic of French input.
To conclude, we can assume that the universal early sensitivity to rhythm will
be reflected in production and that the shape of the early word forms produced
will also be guided by this sensitivity to speech rhythm and speech timing.
Accordingly, the templates are output forms that can be considered as temporary
structural and typologically constrained responses to the temporal organization
of speech with respect to its accentual pattern and the salient rhythmic and
segmental features of the target language.
notes
1. Here are fruits, flowers, leaves, and branches
And here is my heart, which beats only for you.
Do not tear it apart with your two white hands
But may the humble gift, to your lovely eyes, seem sweet. [trans M. Vihman]
2. This has also been argued by Selkirk (1978), who described French as having single
feet, while supporting a trochaic (rather than iambic) binary foot to account for the
French schwa/zero alternation (renard 'fox' [na] can be produced [na] in some
variants of French). This rule, postulating a trochaic binary foot in French, has
garnered much criticism (Tranel 1987) and alternative solutions have been proposed
to account for the schwa/zero alternation (for an overview, see Scheer 2004). A
significant point of criticism is that the trochaic foot rule assumes that the foot makes an ad
hoc appearance solely to resolve the schwa/zero alternation, and is based on no accentual
parameter; in fact, in most contexts its prediction conflicts with French accent
structure.
3. "[There is in French] a tendency to accent the initial syllable of words, which gives
rise to the formation of barytonic patterns and of accentual arcs in which
only the initial and final syllables of a group of words receive an accent"
(Di Cristo 1999: 185).
4. The recording session in which the child first spontaneously produces 25 or more
different word types in 30 minutes.
5. 'Selected' words appear to have been chosen by the child for their fit with the
template (i.e., agneau 'sheep' [alo] fits the <aCo> template, although with
substitution of [l] for //). The 'adapted' words have been modified by the child to fit the same
template (i.e., éléphant 'elephant' [elef] is modified to [afo]). Children's selected
words are close to the adult's target form, within the constraints of the child's
production skills, while the adapted words may be considerably modified.
6. The [t] in the form produced is provided by a commonly occurring phonological
context: the vowel-initial word âne is often preceded by a word ending in a liaison
consonant (i.e., petit [pti] + âne [an] is pronounced [ptitan] 'little donkey'); the child
has likely missegmented such cases, representing âne as *tâne.
7. This phenomenon is also evident in the acquisition of the syllabic onset in French
(Wauquier 2010).
References
Adda-Decker, M., Boula de Mareüil, P., Adda, G., and Lamel, L. (2005). Investigating
syllabic structures and their variation in spontaneous French. Speech
Allen, G. and Hawkins, S. (1978). The development of phonological rhythm. In A. Bell
and J. Hooper-Bybee (eds.), Syllables and segments, pp. 173–85. Amsterdam:
North-Holland.
(1980). Phonological rhythm: definition and development. In G. Yeni-Komshian,
C. Kavanagh, and C. Ferguson (eds.), Child phonology, vol. 1: Production,
Barton, D. 1978. The role of perception in the acquisition of phonology. Bloomington:
Indiana University Linguistics Club.
Bassano, D., Maillochon, I., and Mottet, S. (2008). Noun grammaticalization and
determiner use in French children's speech: a gradual development with prosodic
and lexical influence. Journal of Child Language, 35(2), 403–38.
Braud, V. (1998). Acquisition de l'information phonologique: exemple de la liaison.
Unpublished MA thesis, Université de Nantes.
(2003). Acquisition de la prosodie chez les enfants francophones. Les phénomènes de
troncations. Unpublished PhD dissertation, Université de Nantes.
Braud, V. and Wauquier-Gravelines, S. (2004). Approche gabaritique des phénomènes
de troncation du français. Actes des Journées d'Études sur la Parole, Fez.
Brulard, I. and Carr, P. (2001). Consonant substitution in a bilingual child: Universal
Grammar vs. production templates and strategies. Paper presented at the 3rd
International Symposium on Bilingualism, Bristol, April.
Carvalho, J., Nguyen, N., and Wauquier, S. (2010). Comprendre la phonologie. Paris:
Presses Universitaires de France.
Charette, M. (1991). Conditions of phonological government. Cambridge University Press.
Dell, F. (1985). Les règles et les sons. Paris: Hermann.
Demuth, K. (2001). Prosodic constraints on morphological development. In
J. Weissenborn and B. Höhle (eds.), Approaches to bootstrapping, pp. 3–21.
Amsterdam and Philadelphia: John Benjamins.
Demuth, K. and Fee, E. J. (1995). Minimal prosodic word in early phonological words.
Ms., Brown University and Dalhousie University.
Demuth, K. and Tremblay, A. (2008). Prosodically-conditioned variability in children's
production of French determiners. Journal of Child Language, 35(1), 99–127.
Di Cristo, A. (1999). Le cadre accentuel du français contemporain: essai de modélisation
partiel. Langues, 2(3), 184–205.
Dos Santos, C. (2007). Développement phonologique en français langue maternelle:
une étude de cas. Unpublished PhD dissertation, Lumière Lyon 2 University,
Lyon.
Echols, C. H. and Marti, C. N. (2004). The identification of words and their meanings:
from perceptual biases to specific language-cues. In G. Hall and S. R. Waxman
(eds.), Weaving a lexicon, pp. 41–79. Cambridge, MA: MIT Press.
Elbers, L. and Ton, J. (1985). Play pen monologues: the interplay of words and babbles in
the first words period. Journal of Child Language, 12, 551–65.
Ferguson, C. A. and Farwell, C. B. (1975). Words and sounds in early language
acquisition. Language, 51, 419–39. Reprinted in this volume as Chapter 4.
Fikkert, P., Freitas, M. J., Grijzenhout, J., Levelt, C., and Wauquier, S. (2004). Syllabic
markedness, segmental markedness, rhythm and acquisition. Paper presented at
GLOW Phonology Workshop, April 18.
Fisher, C., Church, B., and Chambers, K. (2004). Learning to identify spoken words. In
D. G. Hall and S. R. Waxman (eds.), Weaving a lexicon, pp. 3–41. Cambridge, MA:
MIT Press.
Fónagy, I. (1980). L'accent en français, accent probabilitaire: dynamique d'un
changement prosodique. In I. Fónagy and L. Léon (eds.), L'accent en français
contemporain, special issue of Studia Phonetica, 15, 123–33.
Francescato, G. (1968). On the role of the word in first language acquisition. Lingua, 21,
144–53.
Gerken, L. A. (1994). Young children's representation of prosodic phonology: evidence
from English speakers' weak syllables productions. Journal of Memory and
Language, 33, 19–38.
Goad, H. (2011). Puzzling input and the role of markedness: the acquisition of Québec
French stress. Paper presented at the International Workshop on Metrics, Phonology
and Acquisition, University of Paris 8, June.
Goad, H. and Buckley, M. (2006). Prosodic structure in child French: evidence for the
foot. Catalan Journal of Linguistics, 5, 109–42. Special issue on the acquisition of
Romance languages as first languages.
Goad, H. and Prévost, A.-M. (2008). Is there a foot in L1 French? The competing roles of
markedness and ambient input. Paper presented at the Linguistic Symposium on
Romance Languages (LSRL) 38, University of Illinois, Urbana-Champaign, April.
Hallé, P. and Boysson-Bardies, B. de (1994). Emergence of an early receptive lexicon:
infants' recognition of words. Infant Behavior and Development, 17, 119–29.
(1996). The format of representation of recognized words in infants' early receptive
lexicon. Infant Behavior and Development, 19, 465–83.
Hallé, P., Durand, C., and Boysson-Bardies, B. de (2008). Do 11-month-old French
infants process articles? Language and Speech, 51, 23–44.
Hayes, B. (1995). Metrical stress theory: principles and case studies. University of
Chicago Press.
Hirsh-Pasek, K., Kemler-Nelson, D. G., Jusczyk, P. W., Wright-Cassidy, K., Druss, B.,
and Kennedy, L. (1987). Clauses are perceptual units for young infants. Cognition,
26, 269–86.
Jusczyk, P. W., Hirsh-Pasek, K., Kemler Nelson, D. G., Kennedy, L., Woodward, A., and
Piwoz, J. (1992). Perceptions of acoustic correlates of major phrasal units by young
infants. Cognitive Psychology, 24, 252–93.
Jusczyk, P. W., Cutler, A., and Redanz, N. J. (1993). Infants' preference for
the predominant stress patterns of English words. Child Development, 64, 675–87.
Jun, S. A. and Fougeron, C. (2000). A phonological model of French intonation. In
A. Botinis (ed.), Intonation: analysis, modeling and technology, pp. 209–42.
Dordrecht: Kluwer.
Keren-Portnoy, T., Majorano, M., and Vihman, M. M. (2009). From phonetics to
phonology: the emergence of first words in Italian. Journal of Child Language,
36, 235–67.
(eds.), Phonological development, pp. 249–73. Timonium, MD: York Press.
(1995). Phonological acquisition. In J. Goldsmith (ed.), The handbook of
phonological theory, pp. 671–97. Cambridge, MA: Blackwell.
Menn, L. (1971). Phonotactic rules in beginning speech: a study in the development of
English discourse. Lingua, 49: 1149.
(1978). Phonological units in beginning speech. In A. Bell and J. Hooper-Bybee
(eds.), Syllables and segments, pp. 157–72. Amsterdam: North-Holland.
Nazzi, T., Jusczyk, P. W., and Johnson, E. K. (2000). Language discrimination by
English-learning 5-month-olds: effects of rhythm and familiarity. Journal of
Memory and Language, 43(1), 1–19.
Nespor, M. and Vogel, I. (1986). Prosodic phonology. Dordrecht: Foris.
Plénat, M. (1984). Toto, Fanfa, Totor et même Guiguite sont des anars. In F. Dell, D. Hirst,
and J. R. Vergnaud (eds.), Forme sonore du langage, pp. 161–81. Paris: Hermann.
(1999). Prolégomènes à une étude variationniste des hypocoristiques à redoublement
en français. Cahiers de grammaire, 24, 183–219.
Ramus, F., Nespor, M., and Mehler, J. (1999). Correlates of linguistic rhythm in the
speech signal. Cognition, 73(3), 265–292.
Rose, Y. (2000). Headedness and prosodic licensing in the L1 acquisition of phonology.
Unpublished PhD dissertation, McGill University.
Rose, Y. and MacWhinney, B. (2013). The PhonBank initiative. In J. Durand, U. Gut, and
G. Kristoffersen (eds.), Handbook of corpus phonology. Oxford University Press.
Scheer, T. (2004). A lateral theory of phonology: what is CVCV and why should it be?
Berlin: De Gruyter.
Selkirk, E. (1978). On the French foot: on the status of mute e. Studies in French
Linguistics, 1, 141–50.
(1984). Phonology and syntax: the relation between sound and structure. Cambridge,
MA: MIT Press.
Stager, C. L. and Werker, J. F. (1997). Infants listen for more phonetic detail in speech
perception than in word-learning tasks. Nature, 388, 381–2.
Tomasello, M. (2000). The item-based nature of early syntactic development. Trends in
Cognitive Sciences, 4(4), 156–63.
Tranel, B. (1987). Floating schwas and closed syllable adjustment in French. In W.
Dressler, H. Luschtzky, O. Pfeiffer, and J. Rennison (eds.), Phonologica 1984,
Veneziano, E. and Sinclair, H. (2000). The changing status of filler syllables on the
way to grammatical morphemes. Journal of Child Language, 27, 461–500.
Verluyten, S. (1982). Recherches sur la prosodie et la métrique du français. Unpublished
PhD dissertation, University of Antwerp.
(1989). L'analyse de l'alexandrin: mètre ou rythme? In M. Dominicy (ed.), Le souci
des apparences: neuf études de poétique et de métrique, pp. 31–74. Brussels:
Editions de l'Université.
Vihman, M. M. (1976). From prespeech to speech: on early phonology. Stanford Papers
and Reports on Child Language Development, 12, 230–44.
(1993). Variable paths to early word production. Journal of Phonetics, 21, 61–82.
Blackwell.
(2010). Templates in adult and child language. Paper presented in the Workshop on
Templates, OCP 7, January 28–30, Nice.
Chapter 2.
Vihman, M. M. and Kunnari, S. (2006). The sources of phonological knowledge.
Recherches Linguistiques de Vincennes, 35, 133–63.
Vihman, M. M., Macken, M. A., Miller, R., Simmons, H., and Miller, J. (1985). From
babbling to speech: a re-assessment of the continuity issue. Language, 61, 397–445.
Vihman, M. M., Nakai, S., DePaolis, R. A., and Hallé, P. (2004). The role of accentual
pattern in early lexical representation. Journal of Memory and Language, 50, 336–53.
Language and Speech, 32, 149–70. Reprinted in this volume as Chapter 8.
Vihman, M. and Velleman, S. (2000). Phonetics and the origins of phonology. In
N. Burton-Roberts, P. Carr, and G. Docherty (eds.), Phonological knowledge,
conceptual and empirical issues, pp. 305–39. Oxford University Press.
Vihman, M., Velleman, S., and McCune, L. (1994). How abstract is child phonology?
Towards an integration of linguistic and psychological approaches. In M. Yavas
(ed.), First and second language phonology, pp. 9–31. San Diego: Singular
Vihman, M. M. and Vihman, V-A. (2011). From first words to segments: a case study in
phonological development. In I. Arnon and E. V. Clark (eds.), Experience,
variation, and generalization: learning a first language (Trends in Language Acquisition
Research 7), pp. 109–33. Amsterdam: John Benjamins.
Waterson, N. (1971). Child phonology: a prosodic view. Journal of Linguistics, 7(2)
Waterson, N. (1987). Prosodic phonology: the theory and its application to language
acquisition and speech processing. Newcastle upon Tyne: Grevatt & Grevatt.
Wauquier-Gravelines, S. (2005). Statut des représentations phonologiques en
acquisition, traitement de la parole continue et dysphasie développementale. Habilitation
thesis, EHESS, Paris.
Wauquier, S. (2006). Du son au sens, acquérir ou apprendre la phonologie. Recherches
Linguistiques de Vincennes, 35, 5–30.
(2010). Templates, spreading and palatal patterns. Paper presented at the 18th
Manchester Phonology Meeting, May 20–22, Manchester.
12 The acquisition of consonant clusters
in Polish: a case study
Marta Szreder
Introduction
The basis for children's phonological processes has been the subject of a
long-standing debate in the literature. Should these processes be taken to reflect the
tuning of an abstract rule or constraint-based system, or the development of
motor and cognitive skills? The former approach is rooted in the generative
tradition (Chomsky and Halle 1968), which postulates a rule-based (Smith
1973; Stampe 1979) or, more recently, a constraint-based (Gnanadesikan
2004; Fikkert and Levelt 2008) system as the starting point of phonological
acquisition. It assumes that development proceeds through the reorganization of
rules or the reranking of constraints. Under this approach, the basic units of
phonological organization are segments.
In contrast, the cognitive approach has assigned much more importance to the
word as a whole. Ferguson and Farwell (1975) were the first to explicitly argue
that the word is the first basic unit of phonological organization. Word-based
processes, demonstrated for several other children (cf. Waterson 1971; Priestly
1977), led to the proposal of word templates (Vihman and Velleman 2000),
which would serve as constraints that control the overall shape of the word
rather than affecting particular segments.
Although Ferguson and Farwell claimed that segment-based processes were
absent in the early stages of acquisition, other cognitive models include
both segments and words in their description of child phonology.
Studdert-Kennedy and Goodell (1993) attempt to explain observed processes within the
terms of the Articulatory Phonology model (Browman and Goldstein 1986,
1992), according to which real-time articulatory gestures are phonological
units, stored and produced in meaningful combinations (i.e., words). Within
this model, the process of phonological acquisition would consist of learning to
produce particular gestures and to coordinate them into word shapes.
Studdert-Kennedy and Goodell conclude that "[a] child's errors in early words can arise
from paradigmatic confusions among similar gestures in a child's repertoire and
from syntagmatic difficulties in coordinating the gestures that form a particular
word" (p. 82). However, it appears that the two approaches could be interpreted
as complementary rather than contrasting. Articulatory Phonology, by adding
the emphasis on particular gestures, also offers a plausible explanation for the
emergence of whole-word processes, i.e., the development of gestural
coordination. Conversely, whole-word phonology can enrich the Articulatory
Phonology model by suggesting the exact nature of the top-down cognitive
processes which interact with articulatory development.
The irreconcilable difference lies between the generative and the cognitive
approaches, which make very different predictions. An important prediction of
the generative approach is that processes will apply to all the same segments and
syllable positions, regardless of the words they appear in, i.e., they should be
triggered by the same segment-sized unit every time it occurs. The cognitive
approach makes no such prediction. Rather, segments are expected to some-
times behave differently in different words. Also, in the constraint-based
approach, processes should target units no larger than the segment, i.e., they
should be blind to other positions in the word. Again in contrast, the cognitive
approach predicts that phonological processes will sometimes change the shape
of the word as a whole. Finally, while constraints are expected to apply in every
token of a given word, under the articulatory approach, attempts at articulating a
problematic segment are expected to have different outcomes on different
occasions. Thus, variability in output is taken to be a natural developmental
phenomenon.
The acquisition of clusters in Polish, where, as in other Slavic languages, they
are pervasive in adult word forms, provides an opportunity to test the contrast-
ing predictions of these theoretical approaches. Accordingly, this chapter exam-
ines the behavior of consonant clusters in the speech of a monolingual child
acquiring Polish, with a particular focus on word-medial clusters. Previous
investigations (Łukaszewicz 2007; Zydorowicz 2007) have shown that word-medial
clusters in Polish are acquired earlier in development than word-initial
clusters. This has been attributed to various constraint-related factors, from
syllable structure constraints (Łukaszewicz 2007), to morphophonotactics and
markedness effects (Zydorowicz 2007).
A similar phenomenon, of early success in the production of what is usually
taken to be a difficult phonetic feature, has been reported for languages which
make use of word-medial geminates (Finnish: Savinainen-Makkonen 2007;
Arabic: Khattab and Al-Tamimi, this volume) and long consonants (Welsh:
Vihman, Nakai, and DePaolis 2006). Those studies suggest that the relative ease
with which the clusters or long consonants are acquired can be attributed, at
least in part, to their salience. However, geminates in Arabic, Finnish and Welsh
occur only in medial position, while consonant clusters are present in all
positions in the word in Polish, yet the medial position still appears to be the
easiest for children. This would suggest that the word-medial, intervocalic
position increases the salience and/or eases production of these segments.
Interestingly, in Finnish, Arabic, and Welsh, word-medial geminates and
long consonants often affect word onsets in children's production, causing
them to be either inaccurately produced or omitted altogether. That consonant
clusters can affect the accuracy of other sounds in the word in the same way that
geminates and long consonants do has been suggested on the basis of one
child's developmental data from Hindi, a language which makes use of both
geminates and clusters (Vihman and Croft 2007).
As applied to Polish, the putative relative salience of word-medial over word-
initial clusters could help to explain why word-medial clusters are acquired
earlier. In addition, if word-medial clusters can be shown to affect other
positions in the word, then this would suggest that articulatory coordination
and planning play an important role in the acquisition of these segments. Such a
finding would point to formal constraints being insufficient to explain the
course of phonological development.
This chapter will examine the processes affecting consonant clusters in the
speech of a Polish child, as well as their effects on other positions in the word.
We shall primarily be concerned with three questions: (1) How do processes
vary across different positions in the word, i.e., do segments in all positions
obey the same constraints? (2) Do word-medial clusters trigger instability or
consonant omission at word onset, as in Finnish, Welsh, Arabic, and Hindi?
(3) How systematic are the processes, i.e., do they apply to the same clusters and
consonants regardless of the word form or the particular token they appear in?
Answers to these questions will provide evidence as to the source of child
errors (articulatory vs. formal constraints) and the units of early phonological
organization (words vs. segments). The processes found in the data appear to
reflect a combination of purely motoric articulatory constraints and articulatorily
motivated yet pre-planned patterns, supporting the Articulatory Phonology
approach. The low degree of systematicity in the behavior of particular seg-
ments does not allow for the postulation of categorical segment-based rules.
However, the systematicity found in the patterns of cluster substitution and in
the behavior of consonants in various word positions suggests that the word as a
whole constitutes an important unit in the child's phonological system. It is
argued that a combination of motoric and attention-related factors leads to the
emergence of phonological systematicity.
Method
The data for the study was collected from the author's son Grzenio (/g/), a
monolingual child acquiring Polish. For the purpose of the current analysis, six
half-hour recordings of spontaneous speech in the home environment were
selected, at intervals of twenty to twenty-six days. At the beginning of the
study, Grzenio was 1;5.28, with an estimated cumulative vocabulary of about 50
words (MLU 1.2). The recordings ended when he was 1;9.28 and his vocabu-
lary was estimated at about 250 words (MLU 2.6). The total number of
interpretable tokens recorded was 1,402, and the total number of different
lexical items was 181. As there was no evidence of a qualitative change in
Grzenio's phonological organization or significant improvement of articulatory
skills over the four months, the data will be treated synchronically, with no
attention to word form changes over that period.
Table 12.1 presents the consonant inventory of Polish, with consonants
produced by Grzenio highlighted in grey. Note that affricates are transcribed
here without a linking diacritic. While [t], [d] + fricative clusters are possible in
Polish, they are often affricated in spoken language, and they were not attemp-
ted by Grzenio. Therefore, all such sequences in this chapter are intended to
represent affricates.
To give an idea of the target system and to illustrate the magnitude of the
challenge that consonant clusters pose to children acquiring Polish, Table 12.2
(based on Milewski 2005) presents the number of consonant cluster types in
different word positions in data from scientific texts, artistic prose, preschool
children (aged 3–7), and Grzenio.
Table 12.1. The consonant inventory of Polish with Grzenio's consonants
highlighted in grey (* = only recorded once in Grzenio's speech)
Place/manner of
articulation Bilabial Labiodental Dental Alveolar Palato-alveolar Palatal Velar
Plosive p t k
b d g
Fricative f* s x
v z
Affricate ts t t
dz d d
Nasal m n
Liquid lateral l
rhotic r
glide w j
Table 12.2. Consonant cluster types in Polish (based on Milewski 2005 and current data)

                             Scientific texts  Artistic prose  Preschool children  Grzenio
Initial consonant clusters
  Number of types            70                72              46                  11
  %                          29.3              28.06           29.11               23.91
Medial consonant clusters
  %                          59.8              61.5            58.86               58.69
Final consonant clusters
  %                          10.9              9.9             12.03               17.39
Total                        239               253             158                 47
Results
Word-initial consonant clusters
As can be seen from Table 12.2, far fewer clusters occur word-initially than
word-medially in Polish. This tendency was also observed for Grzenio, for
whom word-initial clusters accounted for 23.91 percent of all the clusters
produced, in line with the frequency observed for preschoolers and adult speak-
ers of Polish (Milewski 2005). There were only eleven types of word-initial
clusters in Grzenio's repertoire, all of which are presented in Table 12.3 along
with their targets. It can be observed that most targets are of the structure
C1[−continuant] C2[+continuant], and that this structure is preserved in the child
form. The exception is the cluster /p/, which has the opposite order in the target
and which Grzenio reverses.
However, out of 309 tokens attempting 45 different target words with an onset
cluster, the cluster was reduced in all but 25 tokens. Moreover, out of those 25,
only one was reproduced correctly (/kl/). Among the remaining 284 tokens of
words with an onset cluster in target, the cluster was reduced to or replaced by a
single consonant in 277 child forms and was omitted altogether in 7. Table 12.4
presents a selection of target words with consonant clusters in word-initial
position along with the child forms.
Łukaszewicz (2007) also reports numerous cases of onset cluster reduction,
which she finds to be due either to sonority-based deletion, in which only the
less sonorant consonant is retained, or to coalescence, in which the two
consonants are replaced by one that includes phonetic properties of both of the
original sounds. However, Grzenio's forms exhibit no such consistency, and the
Table 12.3. Word onset clusters
produced by Grzenio
Target Grzenio
br bw
b
kfj kx
tl
kl hj
kj
kl
tj
kr kj
kl
k
tj
gm bw
p p
process of reducing clusters to only one of the target consonants does not seem
to be applied in a systematic fashion, as is apparent from the examples in
Table 12.4. For example, clusters with [s] or [] as C1 behaved differently in
different words. śpi 'sleeps' /pi/ sometimes underwent metathesis, sometimes
reduction to the stop only. Yet, the stop was deleted in krab 'crab' /krap/, which
was produced with an initial nasal palatal [ɲ], most likely a substitute for the
liquid. In słoń 'elephant' /sw/ the cluster was replaced with a nasal palatal,
resulting in harmony with the coda. Finally, in smok 'dragon' /smk/ the cluster
was omitted altogether (or part of it moved to the final cluster, a kind of
metathesis).
Moreover, different target clusters were often replaced with the same
sound. For example, the initial nasal palatal was also used in the word
pszczółka 'bee' /ptuwka/, where it bears no resemblance to any of the
target consonants. In fact, using coronal and dorsal segments in place of
word-initial clusters was a pattern that was to some extent regular, in that
50 percent of the clusters that underwent reduction were either reduced to
or replaced by coronal consonants, a further 44 percent were produced with
or replaced by dorsal consonants, and only 6 percent were produced with or
replaced by labials (although labials were present in 30 percent of the target
clusters).
In sum, although the proportion of word-onset clusters to all clusters in
Grzenio's data was about the same as for adult Polish, their production was
still highly unstable. Noncontinuant + continuant combinations were the only
ones produced, but most of the time even target clusters conforming to this
pattern were reduced. The only word that was pronounced with the correct
cluster (klocki 'blocks' /kltski/ [klaki]) had as many as five variants, in two of
which the cluster underwent reduction. This indicates that even this one cluster
/kl/, which was sometimes correctly reproduced, was not stable enough to be
Table 12.4. Examples of child forms with a word-initial cluster
Onset CC not reduced Onset CC reduced/ omitted
Target/Gloss Target IPA Grzenio IPA Target/gloss Target IPA Grzenio IPA
chrupki crisps xrupki hlupki Grzenio g ,
da
grzmi thunders (v) gmi bwi pszczka bee (dim.) ptuwka upk
kredka crayon krtka kjaxka so elephant sw n,
smok dragon smk k

klocki blocks kltski tjaka,
klki
koki,
taki
krab crab krap tjapk ap:ka
pi sleeps pi pi pi
considered fully acquired. Finally, the lack of a clear pattern for the treatment of
any given cluster prevents straightforward attribution of these patterns to the
phonetic characteristics of particular target segments.
Word-medial consonant clusters
Word-medial consonant clusters have the largest number of combinations of all
Polish clusters (see Table 12.2), and this was also true for Grzenio. While
Grzenio produced only noncontinuant + continuant clusters at word onset, the
word-medial clusters he attempted were mostly of the opposite form, i.e.,
continuant + noncontinuant. In the adult language as well, consonant clusters
of this type are relatively rare in word-initial position but frequent word-medially.
Word-medial clusters were also produced far more frequently than word-initial
clusters. Interestingly, the number of word targets Grzenio attempted was similar
for both positions: 45 word-onset-cluster word types (309 tokens) vs. 50 word-
medial-cluster word types (270 tokens). However, medial cluster reduction
occurred in only 73 of the 270 tokens (27 percent), as compared to 277 of the
309 tokens (90 percent) containing word-onset clusters. In 197 tokens the medial
cluster was retained, although it was often produced inaccurately.
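The comparison between the two positions can be checked with a few lines of arithmetic. The following is a minimal sketch in Python using only the token counts reported in the text; the function name reduction_rate and the variable names are illustrative choices, not part of the original analysis.

```python
def reduction_rate(reduced: int, total: int) -> float:
    """Share of tokens (in percent) in which the target cluster was reduced."""
    return 100 * reduced / total

# Token counts as reported in the text.
onset_total, onset_reduced, onset_omitted = 309, 277, 7
medial_total, medial_reduced = 270, 73

# Tokens in which some cluster was still produced.
onset_unreduced = onset_total - onset_reduced - onset_omitted
medial_retained = medial_total - medial_reduced

print(f"onset:  {reduction_rate(onset_reduced, onset_total):.1f}% reduced, "
      f"{onset_unreduced} unreduced tokens")
print(f"medial: {reduction_rate(medial_reduced, medial_total):.1f}% reduced, "
      f"{medial_retained} retained tokens")
```

Rounded to whole numbers, the two rates correspond to the 90 percent and 27 percent cited above, and the remainders match the 25 unreduced onset tokens and 197 retained medial tokens.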
Zydorowicz (2007), in a case study of another Polish child, reports that
morphological clusters seem to be more stable than lexical clusters. The first
nonreduced clusters she observes in her subject, Zosia (2;1), are all word-medial
and all cross-morpheme boundaries, while some of the word-medial clusters
occurring within single morphemes still get reduced. Zydorowicz argues that
this points to a special status of morphological clusters. However, no such
tendency was observed in Grzenio's data. This difference does not result from
Grzenio being more experienced in producing clusters: it is simply that in the
period reported here his speech included very few morphological endings. In
fact, the only morphological suffix in the data that results in a word-medial
cluster is the diminutive, and there is no evidence of its being used productively
(i.e., the words marked with this suffix appear only in the diminutive).
Grzenio's production of word-medial clusters seems to have been more
systematic than his treatment of word-onset clusters. Moreover, the substitution
pattern was more strictly defined. All C2s were either coronal or dorsal
obstruents, in both targets and child forms. C1 was most often a continuant, usually
agreeing in place of articulation with C2. The only exception to these tendencies
was the stop [p], which appeared as C1 in place of all labial C1 targets ([p] was
also the only stop used in this position, apart from the single instance of
gemination of [k]; see Table 12.5). All word-medial clusters produced by
Grzenio are presented in Table 12.5, along with their targets, sorted by C1 in
the child form (C2 being a coronal or dorsal obstruent in all cases).
We can therefore extract three main patterns for Grzenio's medial clusters.
The first pattern applies to all clusters with a labial C1. The preferred C2 is a
coronal or dorsal obstruent, and so the cluster /br/ is replaced with /pt/ or /pt/.
Interestingly, there is one instance of a cluster with no labial consonant in the
target form, namely, /rt/ becoming /pt/.
The second pattern turns sonorant-obstruent clusters into a sequence of
homorganic nasal + obstruent. This sequence was also reported as a frequent
cluster modification by both Łukaszewicz (2007) and Zydorowicz (2007).
Again, however, we can see that the pattern applies to two target clusters that
fail to match the criteria: /pk/ and /tk/, which both become /k/. We can also see
that there is a strong preference for coronal and dorsal segments, as even the
cluster /mp/ is transformed into /nt/.
The third pattern applies to obstruent sequences, which are transformed into
continuant + noncontinuant clusters. This can be seen in the case of the clusters
/tk/, /tsk/, and /tk/, in which C2 is reproduced accurately but C1 appears as any
one of several different fricatives.
The three patterns: [p] + obstruent, homorganic nasal + obstruent, and con-
tinuant + noncontinuant function as preferred output forms or rough templates, in
Table 12.5. Word-medial clusters produced by Grzenio, sorted by C1 in child
form
Labial (30 tokens) Nasal (60 tokens) Fricative (98 tokens) Other (12 tokens) Reduced (75 tokens)
Target Grzenio Target Grzenio Target Grzenio Target Grzenio Target Grzenio
br pt jd d t t tk kk br b
pt t jt jt tk
pk pk t sk t xts jt t
pt pt lk k st t
pt mp nt t lk k
rt pt nd d t k k
wk pk t t k k pr
k nd hk st
k t t
pk k t s
rdz d ht t
rt t tk k t
k hk
tk k k stk
wk g tsk k t t
k k t
hk tsk h
jk tk h
tk hg k
hk wk
k xts t
tn t
xts t
t
ht
t
the sense that they are not predictable on the basis of the target. Only about half of
the words had the same type of cluster (i.e., adhering to one of the above patterns)
in the target and in the child form (e.g., rybka 'fish' /rpka/ [pka]; świnka 'pig'
/ka/ [ika]); in this sense they were 'selected' (Vihman and Velleman 2000).
The other half were 'adapted', meaning that the target cluster was transformed to
match the pattern. Table 12.6 presents a selection of child forms with a cluster in
word-medial position, sorted according to this distinction.
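The select/adapt distinction can be made concrete with a small classification routine. The sketch below is an illustration only, not the procedure used in the study: the feature table is a toy one with ASCII stand-ins ("N" for a velar nasal, "r" for the rhotic), affricates are left out for brevity, and the names template and classify are hypothetical.

```python
# Toy feature table (an assumption for illustration), using ASCII stand-ins.
PLACE = {"p": "labial", "b": "labial", "m": "labial",
         "t": "coronal", "d": "coronal", "n": "coronal", "s": "coronal", "r": "coronal",
         "k": "dorsal", "g": "dorsal", "x": "dorsal", "N": "dorsal"}
MANNER = {"p": "stop", "b": "stop", "t": "stop", "d": "stop", "k": "stop", "g": "stop",
          "m": "nasal", "n": "nasal", "N": "nasal",
          "s": "fricative", "x": "fricative", "r": "liquid"}
OBSTRUENTS = {"stop", "fricative"}


def template(c1, c2):
    """Name the rough template a C1C2 cluster fits, or None if it fits none."""
    # All three templates require a coronal or dorsal obstruent as C2.
    if MANNER.get(c2) not in OBSTRUENTS or PLACE.get(c2) not in {"coronal", "dorsal"}:
        return None
    if c1 == "p":
        return "p + obstruent"
    if MANNER.get(c1) == "nasal" and PLACE.get(c1) == PLACE.get(c2):
        return "homorganic nasal + obstruent"
    if MANNER.get(c1) == "fricative" and MANNER.get(c2) == "stop":
        return "continuant + noncontinuant"
    return None


def classify(target, child):
    """Label a (target cluster, child cluster) pair as selected or adapted."""
    child_template = template(*child)
    if child_template is None:
        return "no template"
    # Selected: the target already has the same cluster type as the child form.
    return "selected" if template(*target) == child_template else "adapted"


print(classify(("p", "k"), ("p", "k")))  # target already fits the child's pattern
print(classify(("b", "r"), ("p", "t")))  # target transformed to fit a pattern
print(classify(("m", "p"), ("n", "t")))  # labial C2 in target, coronal in child form
```

Under these assumptions the first pair counts as selected and the other two as adapted, in line with the treatment of /pk/, /br/, and /mp/ described in the text.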
Again, as was the case with word-onset clusters, we can see that in spite of the
general systematicity of the patterns, their application is by no means fully
consistent. For example, the name Marta /marta/ is reproduced with two different
clusters: [ka] and [ata], and the same is true for soczku 'juice' /stku/, which
appears in two quite different forms: [xku] and [ku]. Moreover, in the case of
the word spodenki 'trousers' /spdeki/, the child form is [dodandi], despite the
child's preferred cluster being present in the target form. As regards the 75 word
tokens in which the medial cluster was reduced, there was also no obvious pattern
as to which segment was retained: in 32 tokens the cluster was reduced to or
replaced with a noncontinuant, but in the remaining 43, with a continuant.
Interestingly, there were also cases of cluster insertion, where the target form
had no cluster but the child form did. Table 12.7 presents a selection of child
forms with an added cluster in medial position.
Word-final consonant clusters
Unlike word-initial and word-medial clusters, the percentage of clusters
Grzenio used in word-final position was higher than observed for adults (see
Table 12.2). Only two types of word-final cluster were used accurately: /t/ and
Table 12.6. Examples of child forms with a consonant cluster in word-medial
position
Select Adapt
Target/gloss Target IPA Grzenio IPA Target/gloss Target IPA Grzenio IPA
babcia grandma bapta apta grzeczny good gtn kti
ukaszka (name, Gen.) wukaka kaka, kahka kredka crayon krtka kjaxka
nie chc
not want
xts t Marta (name) marta ka, ata
nk
leg (Acc.)
nuk ik pszczka
bee
ptuwka upk
po prostu
just so
pprstu tttu soczku
juice (dim., Gen.)
stku xku, ku
rybka
sh (dim.)
rpka pka spodenki trousers (dim.) spdki dodandi
winka
pig (dim.)
ka ika zebra
zebra
zbra pta
/t/, adhering to the patterns observed for word-medial clusters. The rest of the
clusters were mainly results of cluster insertion, but they too obeyed the
patterns: labial + obstruent (e.g., krab 'crab' /krap/ [tjapk]), nasal + obstruent
(e.g., chce 'want' /xts/ [tk]), or continuant + noncontinuant (e.g., leżeć 'to
lie' /lt/ [jat]). Table 12.8 presents all of Grzenio's word-final clusters.
Word-initial singleton consonants
At the time of the study, the child produced word-initial single consonants with
high accuracy, ranging from 75% for labials and 86% for velars to 95% for
coronals and palatals. Interestingly, even within this very small margin of
variation, there was a difference in the behavior of the three places of stop
articulation. The coronals were the least variable of the three places, with the
variability usually attributable to articulatory factors. That is, the sounds often
underwent palatalization, and sometimes even affrication, as in the word tatuś
'daddy' /tatu/, which often appeared as [tati].
Labial stops, on the other hand, were rarely affected by segment-based
processes (there were nine cases in total of a change in voicing or manner),
but, in comparison to the coronals, they were more prone to whole-word
processes such as assimilation, resulting in lower accuracy overall. This is
Table 12.7. Examples of cluster insertion
Target/gloss Target IPA Grzenio IPA
buty shoes but t
czyta to read ttat titat
dywan carpet dvan dida
idzie walks id id
krab crab krap ap:ka, tjapk
ukasz (name) wuka guka
oko eye k k
Table 12.8. Examples of word-final clusters
Accurate Inserted
cze hi tt tt chc I want xts tk
jest is jst it t krab crab krap tjapk
je to eat jt jt lee to lie down lt jat
pu let go (imp.) put put mikrofon microphone mikrfn ptk
zdj to take off zdjt dndn:t tukan toucan tukan tukak
zej to get down zjt zjt
illustrated by the word buty 'shoes' /but/, usually produced as [nuta], and
babcia 'grandma' /bapta/, almost always rendered as [apta]. As regards
the velar stops [k] and [g], the former was usually pronounced correctly, but
the latter was rather infrequent and sometimes replaced with another sound, as
in the word gitara 'guitar' /gitara/ [titaja].
In summary, while the stops were relatively stable in word-initial position,
coronals were the least susceptible to the influence of other segments in the word,
despite being at the same time the least precisely articulated (often varying with
palatals), whereas labials did not undergo many segment-based processes but
were often at least partially assimilated to other consonants in the word.
While coronal stops were sometimes replaced with affricates, word-initial
affricates were also often reduced to stops. Only the palato-alveolar affricates
were present word-initially and those were usually produced accurately (100%
accuracy for [t] and 93% for [d]). Dental and alveolar affricates were pala-
talized to [t] and [d] approximately half of the time and at other times were
reduced to dental stops (as in the word cześć 'hi' /tt/, produced as [te]), but
these affricates were never replaced by other consonants.
A similar pattern to that of stops was observed for word-onset nasal seg-
ments, whose accuracy ranged from 17% for the dental [n] and 66% for the
labial [m] to 83% for the palatal []. In the case of the coronals [n] and [],
variability was almost entirely limited to interactions between the two, i.e., [n]
was replaced only by [], while [] was pronounced as either [n] or (less often)
[j]. Again, labial [m] was an exception: almost all of the 34% of inaccurate
tokens were instances of (partial) consonant harmony (e.g., miś 'teddy' /mi/
[i]). That the variation among coronals can be attributed to articulatory
difficulties is further confirmed by the behavior of the glide [j], which not
only replaced nasal [] in some words, but was also replaced by it in others,
although it was accurately produced 86% of the time (for comparison, the labial
glide [w] was never used accurately).
Fricatives were still relatively undeveloped and infrequent. Apart from a
single appearance of [f] during the final session, only palato-alveolar [] (used
interchangeably with palatal []) and velar [x] (used interchangeably with
glottal [h]) were produced in word-initial position. Those consonants were
also used to replace other fricatives, along with a range of other sounds.
Finally, as is typical for children of his age, except for a single instance of [l]
Grzenio did not produce liquids, which he usually replaced with glides.
In summary, on the basis of the behavior of word-initial segments, we can see
that obstruents and glides were the most developed of Grzenio's consonants,
and among them coronal and palatal segments played a special part. It is perhaps
worth noting that the coronals are particularly difficult in Polish, as the language
distinguishes between two places of articulation (alveolar and palato-alveolar)
and two constriction degrees (fricative and affricate) for those segments.
Combined with voicing distinction, this results in eight coronal consonants
that are very similar articulatorily ([], [], [t], [d], [], [], [t], [d]). Not
surprisingly, at the time of the study, Grzenio had not yet acquired the subtleties
of their production. He used dental stops (but not affricates), none of the
alveolars, and all palato-alveolars except for the voiced fricative []. Still, the
consonants that he produced were often used interchangeably. On the other
hand, those relatively unstable sounds, when produced at word onset, were
seldom influenced by other positions in the word. In fact, if we consider all
inaccurate child forms, articulatory errors (i.e., variable degree of voicing,
nasalization, palatalization, and affrication) affect 80% of tokens with word-
initial coronal or palatal obstruents in the target, but only 7.5% and 28% of
tokens produced for targets starting with labial and dorsal obstruents, respec-
tively. The remaining errors are the result of either omission or assimilation.
Table 12.9 presents instances of omission of word onset.
The data in Table 12.9 illustrate that labial segments were the most likely to
be omitted, whether labial stop [p], nasal [m] or glide [w]. Nevertheless, there
were also cases of omission of coronal and palatal segments, even of a con-
sonant as stable as the glide [j]. What seemed to trigger those processes was the
presence of a consonant cluster later in the word, as in jeszcze 'more' /jt/,
reduced to [t]. In fact, even the variability within the articulatorily moti-
vated range appeared most frequently in words with word-medial clusters. Out
of the 23 most variable word types (i.e., those for which four or more different
child forms were recorded), 12 (52%) had a consonant cluster in word-medial
position in the target form. In comparison, out of 39 words that appeared in only
one form (but in more than one token), only 7 (18%) had a word-medial cluster.
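The grouping behind this comparison (words with four or more different child forms versus words with a single form across several tokens) can be sketched as follows. This is a hypothetical illustration: the function names are invented here, and the toy tokens use placeholder labels rather than real transcriptions from the recordings.

```python
from collections import defaultdict


def variant_profile(tokens):
    """Map each target word to (number of distinct child forms, number of tokens)."""
    forms, counts = defaultdict(set), defaultdict(int)
    for word, child_form in tokens:
        forms[word].add(child_form)
        counts[word] += 1
    return {w: (len(forms[w]), counts[w]) for w in forms}


def split_by_variability(tokens, threshold=4):
    """Return (most variable words, invariant words with more than one token)."""
    profile = variant_profile(tokens)
    most_variable = sorted(w for w, (v, _) in profile.items() if v >= threshold)
    invariant = sorted(w for w, (v, n) in profile.items() if v == 1 and n > 1)
    return most_variable, invariant


def share_with_medial_cluster(words, has_medial_cluster):
    """Percentage of the given (non-empty) word list with a medial cluster in the target."""
    hits = sum(1 for w in words if has_medial_cluster[w])
    return 100 * hits / len(words)


# Hypothetical toy data: placeholder word and form labels only.
tokens = [("wordA", "a1"), ("wordA", "a2"), ("wordA", "a3"), ("wordA", "a4"),
          ("wordB", "b1"), ("wordB", "b1"),
          ("wordC", "c1")]
has_medial_cluster = {"wordA": True, "wordB": False, "wordC": False}

most_variable, invariant = split_by_variability(tokens)
print(most_variable, share_with_medial_cluster(most_variable, has_medial_cluster))
print(invariant, share_with_medial_cluster(invariant, has_medial_cluster))
```

Words recorded in a single token (wordC here) fall into neither group, mirroring the restriction to words that appeared "in more than one token".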
To summarize, while the accuracy of word-initial singleton consonants was
high, the occasional errors that did appear fell into one of two general categories.
Firstly, there were errors that could be said to result from imprecise articulation or,
in the case of consonants that had not yet been acquired, from substitution with a
similar sound. The processes that fell into this first category all involved variation
Table 12.9. Examples of word-initial consonant omission
Target/gloss Target IPA Grzenio IPA
dobranoc goodnight dbrants at
jest is jst ex
jeszcze more jt it
ko bed wuk uhk
ukasz (name) wuka uka
ukaszek (name, dim.) wukak ukahk
Marta (name) marta ka
mi teddy bear mi i
misia teddy bear (Gen.) mia ia
pika ball piwka ika
pompon pompon pmpn ntn
soczku juice (dim., Gen.) stku aku
in voicing, nasalization, palatalization, and affrication, or substitution strategies
common also in children acquiring English, such as the gliding of liquids (see
Grunwell 1985). The second category comprised errors in which the word-initial
consonant was replaced by a sound that shared more properties with consonants
appearing later in the word than with the target sound. In this sense, those
processes can be defined as word-based. Table 12.10 presents examples of
child forms with inaccurate word onset, sorted according to this distinction.
The above selection demonstrates that there were no substitutions which could
not be attributed to either articulatory or whole-word coordination issues. However,
it is worth noting that in some cases it was not entirely clear whether a particular
child form was a result of a segment-based or a word-based process. For example,
the initial [t] in the child's rendition of the word tatuś 'daddy' /tatu/ [tati] could
have resulted from assimilation of the target [t] to the word-final [], or it could
have been due to segment-based palatalization. In such cases, the process in
question was classified as segment-based, in order not to overestimate Grzenio's
whole-word bias.
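The two-way classification, including the conservative treatment of ambiguous cases, can be illustrated with a simple feature-overlap comparison. The sketch below is an assumption-laden illustration rather than the coding scheme actually used: segments are reduced to three features (place, manner, voicing), the symbols are ASCII stand-ins ("Z" for a voiced coronal fricative), and the names shared and classify_onset are hypothetical.

```python
# (place, manner, voicing) for a few ASCII stand-ins; an illustrative toy table.
FEATURES = {
    "b": ("labial", "stop", "voiced"),
    "w": ("labial", "glide", "voiced"),
    "g": ("dorsal", "stop", "voiced"),
    "t": ("coronal", "stop", "voiceless"),
    "r": ("coronal", "liquid", "voiced"),
    "Z": ("coronal", "fricative", "voiced"),
}


def shared(a, b):
    """Number of features (out of three) on which two segments agree."""
    return sum(x == y for x, y in zip(FEATURES[a], FEATURES[b]))


def classify_onset(target, child, later):
    """Classify a word-initial substitution as segment-based or whole-word.

    `later` lists the consonants that follow the onset in the target word.
    """
    if child == target:
        return "accurate"
    to_target = shared(child, target)
    to_later = max((shared(child, c) for c in later), default=0)
    # Ties count as segment-based, so that whole-word effects are not overestimated.
    return "whole-word" if to_later > to_target else "segment-based"


# Onset /g/ produced as [t] in a word whose later consonants include /t/.
print(classify_onset("g", "t", ["t", "r"]))
# Onset /b/ produced as [w] in a word whose only later consonant is a coronal fricative.
print(classify_onset("b", "w", ["Z"]))
```

With these toy features, the first case comes out as whole-word and the second as segment-based, matching the placement of gitara and burza in Table 12.10.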
Discussion
Do processes vary across different positions in the word?
A comparison of the behavior of consonants in the four positions discussed
above (singletons at word onset; word-initial, word-medial, and word-final
Table 12.10. Examples of child forms with inaccurate word onset, sorted by
type
Segment-based processes Whole-word processes
bardzo very bardz mad babcia grandma bapta apta
burza storm bua wua buty shoes but nuta
co what ts t gitara guitar gitara titaja
czapki hats tapki tapki ukasz (name) wuka guka
cze hi tt t Marta (name) marta ata
czysty clean tst tit mi teddy bear mi ji, i
jestem I am jstm m po prostu just so pprstu tttu
ko horse k ga Wanda (name) vanda dada
ley lies (v) l ji zebra zebra zbra wb, pta
pan mister pan ba zej to get down zjt jjt
robi to make rbit jbit, pit
rybka sh (dim.) rpka pka
sam alone sam am
szafa closet afa hafa
tatu daddy tatu tati
wyla he spilled vlaw blaw
consonant clusters) suggests that each poses its own challenges to the child.
This is particularly apparent in the differential treatment of the different types of
consonant in different positions in the word.
We have seen that labial segments were not very stable in the child's
production at the time of the recordings. The fricatives were just emerging,
voiced [v] was not produced at all, and voiceless [f] was recorded only once.
The glide [w] did appear, but was never used accurately (only as a substitute for
another consonant). In fact, of all labial consonants available in the target
language, only stops and the nasal [m] were used consistently. However, even
these segments exhibited a much higher degree of variability than their coronal
and dorsal counterparts, and in particular variability that was not limited to
articulatory modication, but was often the result of assimilation to another
consonant. Perhaps not surprisingly, labial stops were also the most likely to be
omitted in word-onset clusters. We could argue that they still posed difficulties
of articulation for Grzenio, and thus were much more vulnerable when co-
articulatory factors came into play. Nevertheless, the situation of labial stops
was different when they appeared as C1 in word-medial clusters. Specifically,
they seemed to be the only stops immune to the cluster template, which replaced
all C1 stops with continuants. Highly susceptible to variation at word onset,
even if not constituting a part of a cluster, they were almost change-resistant
when in syllable coda, even though there was another immediately following
consonant that could have been expected to affect them.
The situation of coronal and dorsal segments was very different in this
respect. Although the fricatives and liquids were seldom produced at word
onset and instead were usually replaced by other segments, stops, affricates,
and glides, while not always precisely articulated, were very rarely affected by
other segments in the word. Moreover, they were also usually retained in word-
initial clusters, and in fact the very few clusters that Grzenio produced in this
position consisted of a coronal or dorsal stop or affricate followed by a liquid or
glide. But again, the sounds behaved very differently in word-medial clusters.
Here, stops and affricates, which were so stable at word onset, were almost
invariably transformed into fricatives or nasals when they appeared as C1 in
medial clusters. On the other hand, C2 in word-medial clusters tended to be
coronal or dorsal even if C1 was the labial [p]. The preferred structure of clusters
is also apparent in the templates applied to many of them: C1[−continuant]+C2
[+continuant] for word-initial clusters, and C1[+continuant]+C2[−continuant]
for word-medial position (with the exception of [p]+[obstruent] clusters).
As regards manner of articulation, liquids were in general produced only as
C2 in word-initial clusters, while fricatives occurred as C1 in word-medial
clusters. The behavior of fricatives here thus confirms the findings of
Ferguson (1975), according to which fricatives tend to be acquired first in
syllable coda. Moreover, it shows that the constraints on the form of clusters
differed depending on word position. First of all, Grzenio mainly produced
consonant clusters in word-medial position. Also, whereas some of Grzenio's
preferred clusters are not allowed word-initially in the target language (e.g., /k/,
/nt/), in other cases the target clusters are structurally the same in both positions
but were attempted only word-medially by the child. For example, in the word
chcę 'want' /xts/, the onset cluster was reduced to /t/; but when negation was
added, so that the cluster appeared intervocalically, it was fully preserved in the
child form: nie chcę 'not want' /xts/ [t]. In short, the sets of clusters that
Grzenio produced in the two positions were mutually exclusive.
In general, the constraints on the phonological behavior of consonants in
Grzenio's production seem to be strongly dependent on their particular position
in the word, rather than only on the phonetic identity of particular segments.
This is not to say that the phonetic identity is irrelevant, as the application of
the cluster template was to some extent based on the target consonants, as is
apparent from the fact that the cluster /pk/ was immune to it. Furthermore, as we
have seen, some of the processes affecting Grzenio's consonants appear to have
been segment-based, in that the variation observed for a given sound could not
be explained by the influence of other segments in the word. However, like the
errors discussed by Studdert-Kennedy and Goodell (1993), the segment-based
processes were always articulatorily motivated, in the sense that the child's
rendition of a given segment was close to the target with respect to its articulatory
properties. For example, the initial [b] in burza 'storm' /bua/ varied with another
labial segment, [w]; [] often varied with [j], another palatal segment; and the
coronal obstruents appeared with variable degrees of palatalisation and affrica-
tion. Moreover, the templates applied to clusters often included consonants
agreeing in place of articulation, which would suggest that articulatory factors
might also be partly responsible for the emergence of the patterns. More specif-
ically, producing a sequence of continuant followed by noncontinuant with the
same place of articulation in fact requires only an increase in constriction, and
should thus be easier to produce than other types of consonant clusters.
Do word-medial clusters trigger instability or consonant omission
at word onset, as in Finnish, Arabic, Welsh, and Hindi?
While many of Grzenio's processes could be explained by imprecise articulation,
those processes were more likely to occur under particular conditions. First
of all, word-medial consonant clusters, like the long consonants of Finnish and
Welsh, affected the accuracy of word onset. This suggests that the clusters are
particularly salient perceptually: probably because of both their length and the
fact that they are still challenging to the child, and their production therefore
requires more attention, which results in less focus on the onset. In addition,
articulatory difficulties affecting the word onset might be exacerbated by the
planning required to produce a word with both an onset and a complex sequence
of consonants later on.
Secondly, the fact that the consonant cluster patterns were sometimes applied
to consonant clusters irrespective of their structure indicates that there is more to
Grzenio's phonological system than just on-line articulatory difficulties.
Specifically, it often seemed that the template targeted clusters on the basis of
their property of being a cluster and was not reserved for particularly troublesome
combinations of sounds. The template was also applied to the same words
in different ways on different occasions. While difficulties in articulation can
certainly be said to underlie the emergence of the pattern, the strategy employed
to deal with those difficulties seems to be based on a generalization that suggests
the presence of an emerging phonological system. Given that remembering
every word in detail requires extensive memory resources, such generalizations
are to be expected: using a pre-prepared template should considerably increase
the speed of learning challenging words.
How systematic are the processes, i.e., do they apply to the same
clusters and consonants regardless of the word form and the token
they appear in?
To sum up the evidence that has been presented with regard to this last question,
let us examine once more some of the most telling examples from Grzenio's
data. Examples (1) through (3) illustrate what appears to be purely articulatory
variability. The initial labial [b] is rendered correctly in (1) and in one variant of
(2). However, in the other variant it is substituted by another labial consonant:
[w]. The same happens to the labial [m] in (3), which in one variant appears as a
glide and in the other is omitted altogether. It is also omitted in (4). Although
these processes affect single segments, the extent of variability makes it impos-
sible to postulate any segment-based rules. Imprecise gestural control seems to
better account for these processes.
(1) babcia grandma /bapta/ [bapta]
(2) burza storm /bua/ [bua], [wua]
(3) mi teddy /mi/ [w], [i]
(4) Marta /marta/ [ka]
Examples (5) through (7) present other variants of targets (1), (3), and (4), but
this time the labial segment assumes a coronal place of articulation. This is
likely to be due to the influence of the consonants later in the word, and so a
whole-word, rather than segment-based, process. In addition, the substituted
palatal undergoes articulatorily motivated changes as well, when it varies
between the glide [j] and the nasal [] (in (5) also [n] and [d]).
(5) babcia grandma /bapta/ [japta], [apta], [npt], [dapta]
(6) mi teddy /mi/ [ji], [i]
(7) Marta /marta/ [jata], [apta]
At the same time, the word-medial cluster in (4) and (7) appears in three
different forms. Each of the forms is compatible with one of the cluster
templates that Grzenio used: [p] + obstruent (despite the lack of a labial in the
target cluster) and homorganic nasal + obstruent. The same happens with the
word-medial cluster in (8), despite the fact that the target cluster is already of
the preferred form. This shows that the template is not applied according to the
phonetic identity of particular segments: the same cluster can be substituted for
different targets and the same target can be replaced with different clusters. In
addition, in (8) the word-initial cluster is reduced and harmonized with the
following coronal segment: again, a whole-word process.
(8) spodenki trousers /spdki/ [ddki], [ddandi]
In (9), the word-initial cluster has the preferred structure, but is nevertheless
sometimes deleted. Where it is retained, it appears in two different forms, one of
which, /tj/, is the same as the one in (10), presumably because of the similar
target form. However, (10) also has a variant with word-initial palatal [], which
is not the case with (9).
(9) klocki building blocks /kltski/ [tjaki], [klaki], [taki], [kki]
(10) krab crab /krap/ [apka], [tjapk]
In other words, each of the transformations in (1) through (10) can be explained
by at least one of the relatively regular processes that were observed for the data
set as a whole. Nevertheless, these processes often not only act together, but are
also applied in a broadly unsystematic way, making it impossible to postulate
any categorical rules for Grzenio's production. It seems that these processes are
much more plausibly explained in terms of the three main sources we have
discussed: (1) confusions between similar gestures (such as [b] and [w]); (2)
problems in planning and coordinating sequential gestures inside words (solved
sometimes by repeating the same gesture twice, i.e., consonant harmony); (3)
the emerging phonological system that is being built in large part on the basis of
the child's own production, leading to certain gestural schemata being generalized
to words which do not share segmental structure (e.g., [k] being substituted
for [rt] in Marta /marta/).
Conclusion
The nature of the processes as well as the manner of their application seem to
suggest that articulatory and planning factors are the main source of the child's
errors. Moreover, there is evidence that not only particular segments, but also
whole words constitute units of phonological organization for the child. First of
all, the processes are triggered by the overall shape of the word, in that there is
notable interaction between initial and medial position, as word-medial clusters
affect the stability of word-onset consonants. Secondly, the processes affect the
overall shape of the word, in the sense that there are templates, or favored
articulatory patterns, for consonant clusters depending on their position in
the word, rather than on the properties of the particular consonants forming
the target cluster. Finally, although motivated with regard to articulation, the
processes are neither categorical nor obligatory, as they affect only a subset of
the potential targets and they do so only part of the time. Therefore, although
there are broad regularities in the child's production, the forms are largely
unpredictable and resist formulation in terms of any segment-based rules. The
observed patterns suggest that the child's phonological organization reflects a
combination of articulatory, planning, and attentional or memory factors and
their interrelations within particular words. Importantly, this organization
appears to be built on the child's linguistic experience, as opposed to being
a preexisting structure adjusted to the input, as suggested by the nativist
approach.
References
Browman, C. P. and Goldstein, L. (1986). Towards an articulatory phonology.
Phonology Yearbook, 3, 219–52.
(1992). Articulatory Phonology: an overview. Phonetica, 49, 155–80.
Row.
Ferguson, C. A. (1975). Fricatives in child language acquisition. In L. Hellman (ed.),
Proceedings of the Eleventh International Congress of Linguists, pp. 647–64.
Bologna: Il Mulino.
Fikkert, P. and Levelt, C. C. (2008). How does place fall into place? The lexicon and
emergent constraints in the developing phonological grammar. In P. Avery, B.
Elan Dresher, and K. Rice (eds.), Contrast in phonology: perception and
Acquisition, pp. 231–67. Berlin: Mouton.
Gnanadesikan, A. (2004). Markedness and faithfulness constraints in child phonology.
In R. Kager, J. Pater, and W. Zonneveld (eds.), Constraints in phonological
acquisition, pp. 73–108. Cambridge University Press.
Grunwell, P. (1985). Phonological assessment of child speech (PACS). Windsor: NFER-
Nelson.
Łukaszewicz, B. (2007). Reduction in syllable onsets in the acquisition of Polish:
deletion, coalescence, metathesis and gemination. Journal of Child Language,
34(1), 53–82.
Milewski, S. (2005). Grupy spółgłoskowe w języku mówionym dzieci przedszkolnych.
LOGOPEDA, 1(1), 5–32.
First Language, 27(4), 347–59. Reprinted in this volume as Chapter 13.
Press.
Stampe, D. (1979). A dissertation on Natural Phonology. New York: Garland.
Studdert-Kennedy, M. and Goodell, E. W. (1993). Acoustic evidence for the development
of gestural coordination in the speech of 2-year-olds: a longitudinal study.
Journal of Speech and Hearing Research, 33, 707–27.
Chapter 2.
Vihman, M. M., Nakai, S., and DePaolis, R. A. (2006). Getting the rhythm right: a cross-linguistic
study of segmental duration in babbling and first words. In L. Goldstein,
D. Whalen, and C. Best (eds.), Laboratory Phonology 8, pp. 341–66. New York:
Mouton de Gruyter.
Vihman, M. M. and Velleman, S. L. (2000). The construction of a first phonology.
Zydorowicz, P. (2007). Polish morphonotactics in first language acquisition. In F. Menz
and M. Rheindorf (eds.), Wiener linguistische Gazette, 74, 24–44.
13 Geminate template: a model for
first Finnish words
Tuula Savinainen-Makkonen
Introduction
Children's first productive word shapes are often CV syllables. It has been
claimed that all children produce word-initial consonants (Bernhardt and
Stemberger 1998; Jakobson, 1941/1968; Stoel-Gammon 1985). The initial
position is often seen as the most stable because new consonants are usually
first acquired in this position (Ferguson and Farwell 1975), although some
consonants, such as /k/ and fricatives, frequently first appear word-finally
(Dinnsen 1996; Edwards 1979, 1996; Stoel-Gammon 2002) and some learners
show child-specific preferences, favoring word-final consonants more generally
(Menn 1971; Stoel-Gammon and Cooper 1984). The strength of the word
position can be measured not only according to the number of new phonemes,
but also on the basis of phonological processes (Grunwell 1985; Ingram 1989;
Stampe 1969, 1979): for example, whether the different processes, such as
assimilation and omission, affect segments in the initial or medial position.
According to Stoel-Gammon (1996), phonemes are often produced more accurately
in word-initial position. In children learning English, the omission of a
word-initial consonant has been classified as an atypical process (Grunwell
1985; Howell and Dean 1994). In addition, although the onset is restricted to a
single consonant in the first words of most children, so that word-initial consonant
clusters are rare, the deletion of both members of a cluster is very rare
(Chin and Dinnsen 1992; Ingram 1989).
However, segmental accounts do not do justice to all early child data. Some
children's early words fall into templates (e.g., Ingram 1999; Macken 1993;
Menn 1983; Vihman 1991). Template matching is a modification strategy where
the children, due to their more or less constrained output template, rearrange the
sounds of adult words in various ways to fit into their own template. In addition
to several individual templates, one of the most often mentioned is the trochaic
template. Allen and Hawkins (1978, 1980) found that within longer words there
is a preference for strong–weak (SW) sequences over weak–strong (WS)
sequences. On this basis, children produce a strong syllable followed by an
optional weak syllable. The weak syllable that does not fit the SW template is
omitted (e.g., [nn] 'banana'). The same preference is also found in early
perception. Children at 9 months of age prefer listening to SW sequences over
WS sequences (Echols, Crowhurst, and Childers 1997; Jusczyk, Cutler, and
Redanz 1993). Although the segments or the features of the unstressed syllables
that are retained in production reveal that the elements of unstressed syllables
are also registered by the child (Johnson, Lewis, and Hogan 1997), several
approaches assume that children pay particular attention to stressed syllables
and set up trochaic templates (SW) in their early word production (Fikkert 1994;
Gerken 1994; Wijnen, Krikhaar, and den Os 1994). However, the stressed
syllables are not the only acoustically salient guide. Kehoe and Stoel-Gammon
(1997) found that segmental factors also influence truncation rates. In addition to
perceptual factors, possible explanations for the segmental effect include articulatory
factors, the effects of imitation, and syllabification tendencies.
This research and the preparation of the manuscript were supported by a grant from the Emil
Aaltonen Foundation.
There is a growing body of cross-linguistic data on early phonological development.
It is particularly important to add to this empirical base evidence from a
language that is quite different from English in its phonological structure. The
phonological patterns of adult Finnish are very different from those of most of the
languages for which we have acquisition data. Indeed, the latest studies have
provided interesting observations on Finnish children's early word forms. The
omission of a word-initial consonant is found to be fairly common during the
early stages of speech development in Finnish children (Kunnari 2000;
Savinainen-Makkonen 2000a, 2001) despite the fact that in Finnish the primary
stress always falls on the first syllable of the word, e.g., kuk.ka 'flower' (SW),
ba.naa.ni 'banana' (SWW). It has been suggested that the length of the words may
offer an explanation (Savinainen-Makkonen 2000a, 2001). In English, monosyllabic
words are frequent but in Finnish they are rare; of the 70,000 words in the
Reverse dictionary of modern standard Finnish, only 0.1 percent are monosyllabic
(Tuomi 1980, cited in Karlsson 1983). Since English words are short,
children cannot afford to omit the initial consonant. In Finnish, on the other
hand, children hear inflected words¹ with three or more syllables, but are not able
to master the whole word. Target words are simply too long and too complex to be
pronounced correctly, so something must be omitted. Indeed some evidence has
been found that in monosyllabic Finnish words the realization of word-initial
consonants may be more frequent (Savinainen-Makkonen 2000a). However,
even though monosyllabic function words are among the most frequent word
types in Finnish according to the frequency dictionary, there are so few monosyllabic
words in child Finnish that we would need an experimental study with
nonsense words to examine this hypothesis. Kunnari (2002) found only a few
monosyllables in a study of 10 Finnish-speaking children at the first fifty-word
stage and most of them consisted of interjections and onomatopoeic expressions.
The latest geminate² studies have brought up the question of the saliency of
the word-medial position. Richardson (1998) showed that typical Finnish
infants are able to distinguish between short and long contrasts (/t/ versus
/t:/) as early as the age of 6 months. The latest studies suggest that the contrast
in Finnish production may also begin early. Vihman and Velleman (2000) show
that already by the twenty-five-word point,³ when the cumulative lexicon is
about fifty words, Finnish children's production is sharply distinguishable
from that of children acquiring languages such as French and English with
no phonological quantity contrast. Kunnari, Nakai, and Vihman (2001) found
that Finnish children begin to differentiate singleton from geminate targets in
production by the end of the one-word period, whereas Japanese children,
although also exposed to a language that makes a quantitative contrast in medial
consonants, begin to distinguish them later. These cross-linguistic differences
may be due to the differences in input frequency; the quantity contrast is nearly
twice as frequent in Finnish as in Japanese (Aoyama 2000; Kunnari et al.,
2001). In addition, the degree of distinctiveness of the contrast in adult speech
has been suggested to explain the difference (see Aoyama 2000). The saliency
of the medial position in a word, in particular a word with a geminate structure,
receives further support from Finnish studies of the early child production
of three-syllable words. Although many early words fit into the SW1 pattern,
syllables with a geminate stop may be included irrespective of position, for
example, [uk:] /lusik:/ 'spoon', [k:] /n:ik/ 'Annika' (Savinainen-Makkonen
2000b, c, 2001). Instead of syllables themselves, we should pay
attention to the medial-geminate (C)VC:V structure as a template pattern.
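As a rough illustration of what matching a form against the (C)VC:V shape involves, the following sketch tests transcriptions in which a colon marks length, as in this chapter. The vowel and consonant inventories, the regular expression, and the function name are assumptions introduced here, not part of the original analysis; the pattern is deliberately strict and does not cover the broader (C)V(:)C:V(:) shapes with long vowels.

```python
import re

# Assumed symbol inventories for this sketch (plain orthographic letters).
VOWELS = "aeiouyäö"
CONSONANTS = "ptkdmnrlshjvw"

# Optional onset consonant, short vowel, long (geminate) consonant, short vowel.
GEMINATE_TEMPLATE = re.compile(
    rf"[{CONSONANTS}]?[{VOWELS}][{CONSONANTS}]:[{VOWELS}]"
)


def matches_geminate_template(form):
    """Return True if the whole form has the shape (C)VC:V."""
    return GEMINATE_TEMPLATE.fullmatch(form) is not None
```

With these assumptions, forms such as `tut:i` and `uk:i` match, whereas a form without a medial geminate, such as `kotiti`, does not.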
We examine the phenomena mentioned above with the help of a case study
and raise the question of how strong the CV syllable is and how useful the
CVCV structure is as a description of the first words when we view it through
the lens of Finnish data.
Participant and data collection
Joel is the second of two children in a monolingual Finnish-speaking family.
His parents are university graduates. Joel has a sister who is three and a half
years older. Joel's motor skills have developed well. He produced his first word
at 1;1 and his later language skills developed typically. The data here consist of
Joel's first 50 words (see Appendix), which were entered into a diary by a parent
who is a speech and language pathologist. All the words were produced
spontaneously and were transcribed immediately using IPA.
Results
It took Joel approximately six months to acquire his first 50 words (1;1–1;7),
which corresponds to earlier studies (Fenson, Dale, Reznick, Bates, Thal, and
Pethick 1994; Lyytinen 1999). Out of Joel's 50 target words, 47 (94 percent)
contain two syllables; this includes the word /hei-hei/ 'bye-bye' (1 in the
Appendix), which is a reduplication of a monosyllabic word /hei/ 'bye'. This
corresponds to the structure of basic forms in Finnish: the most frequent is a
structure containing two syllables (Karlsson 1983: 215). Joel produced all the
disyllabic target words as disyllabic except for the word /pl:o/ [pm] 'ball' (9), in
which the /l/ is not realized and the word is reduced. There are no forms that have
only a simple CV structure. As in the study by Saaristo-Helin, Savinainen-Makkonen,
and Kunnari (2006) of seventeen Finnish-speaking children (mean
age 1;8) in which 96 percent of two-syllable targets were realized as disyllables,
Joel seems to omit the hypothesized Core Syllable stage of English- and Dutch-speaking
children, in which the grammar is constrained to produce no more than
one syllable (Demuth and Fee 1995). Three (6 percent) of Joel's target words were
polysyllabic, which is in accordance with earlier Finnish studies (e.g., Kunnari
2000). Although a typical Finnish word in its basic form is bisyllabic, inflected
words usually have three or more syllables, with the result that children learning
Finnish attempt to produce trisyllabic and even longer targets already at the stage
of the first 50 words. In Savinainen-Makkonen's (2001) study of six children, the
rate of attempts at polysyllabic words was 8 percent, and in Kunnari's (2000) study
of ten children the rate of attempts was 9 percent at the 25-word point (cumulative
lexicon of about 50 words). These early-targeted long words generally suffer from
reduction (see Kunnari 2000, 2002; Kunnari and Savinainen-Makkonen 1999;
Savinainen-Makkonen 2000b, 2000c, 2001). Joel adopted different strategies to
produce trisyllabic words, such as reduplication (30, 36), truncation (31, 36), and
pausing (30). Each of the examples indicates the child's rendition of the word(s)
and the phonological shape of the target word, followed by its English translation
(the written forms of the words are given in the Appendix).
[kotiti], [kt:iti], [ko(.)titi] /trktori/ tractor (30)
[tui], [ui] /lusik:/ spoon (31)
[mnini] /mndri:ni/ mandarin (orange) (36)
All these phonological processes have been reported in child phonology; for
example: in English (Grunwell 1987; Ingram, 1989), Swedish (Nettelbladt
1983), and Finnish (Savinainen-Makkonen 2000b; Turunen 2003).
All Joel's target words end in an open syllable. This is also how he produces
these targets, with the exception of the form [pm] /pl:o/ 'ball' (9). In Finnish,
basic forms ending in closed syllables are rare.⁴ About 40 percent of Joel's words
started with vowels and about 60 percent with consonants. Out of the eleven
Finnish consonants /p t k m n r l s h j/ that appear word-initially, Joel's first
words may begin with oral stops /p, t, k/ or nasals /m, n/, and /w/. The /w/ does not
appear in adult Finnish, but like many young Finnish children, Joel used it as an
attempt to produce the labiodental //. Word-initial consonant clusters appear in
Finnish only in loanwords, which Joel did not yet target, with the exception of
/trktori/ 'tractor' (30).
Out of the thirteen Finnish consonants /p t d k m n r l s h j/ that appear
word-medially, Joel's first words included /p, t, k, m, n, s, w/. The fricative /s/
was targeted word-medially in three different words: [ii] /isi/ 'daddy' (11),
[tui], [ui] /lusik:/ 'spoon' (31), and [tuti] /bus:i/ 'bus' (49). There are only
two fricatives in standard Finnish (/h/ and /s/, a voiceless alveolar fricative) and
therefore /s/ can vary extensively. In this study no attempt was made to transcribe
accurately the different occurrences of /s/, and [] was chosen to best
represent the /s/ phoneme. Both non-Finnish consonants /w/ and [] can be
attributed to developing phonetic accuracy. The phoneme /d/, one of the
earliest acquired among English-learning children, is highly restricted in Finnish and is
therefore often the last consonant acquired by Finnish children (Toivainen 1997).
The medial position of the word is the most interesting. Out of Joel's first 50
word forms, 37 (74 percent) have a geminate structure. Many of these (20 percent)
are simply cases of modeling the simple (reduplicative) (C)V(:)C:V(:) input words:
[kk:] /kk:/ poo-poo (7) [kuk:] /kuk:/ flower (8)
[uk:i] /uk:i/ grandpa (25) [kik::] /ki:k::/ swing (38)
[uk:o] [uk:] /uk:o/ old man (39)
[tut:i] /tut:i/ dummy (13) [ot::] /ot::/ take (48)
[mum:i] /mum:i/ grandma (24)
[n:] /n: / give (2)
[ww:], [wuw] /u: / baby (50)
In addition to these correctly produced forms there are several simplified
(one-consonant type) words achieved by the omission of a word-initial conso-
nant (/t/, /h/, /l/, //):
[ip:o] /tip:u/ fell (23)
[ep:], [pep:] /hep:/ horsie (28) [t:u] /ht:u/ hat (26)
[p:u] /lop:u/ all gone (4) [en:u] /len:u/ Lennu (46)
[et:] /et: / water (5)
In these omission cases the word structure is VC:V, showing that onsets are not
obligatory. In addition to these geminate examples are some other word types
that also omit the word-initial consonant (/m/, /n/, /s/, /h/, /l/):
[en:i], [ni] /meni/ went (27)
[mi], [mmi], [mm:i] /nmi/ candy (33)
[io] /sisko/ sister (45)
[eiei] /heihei/ bye-bye (1)
[e:p:], [p:] /leip:/ bread (37)
Several simplified forms achieved by (regressive) assimilation are also present:
[pp:] /ip:/ napkin (6) [pep:], [ep:] /hep:/ horsie (28)
[pp:y] /lp:u/ bib (18)
[pip:u] /lip:u/ flag (43) [p:p:] /s:p::t/ boots (40)
[tt:i] /rt:i/ cloth (34)
[kik:i] /rik:i/ broken (14)
Assimilation always involves the assimilation of an alveolar to a nonalveolar
consonant. Since the liquids (/r, l/), the semivowel /v/, and the fricative /h/ are
not yet part of the child's inventory, they are especially prone to assimilation or
omission.
However, not all forms can be explained so simply. Accommodation of diverse
adult forms to a single preferred output pattern, a behavior typical of what Menn
(1983) calls template matching, can also be found. Joel has applied the geminate
template to some adult words where it is not part of the word's structure:
[t:i], [ti], [ti] /iti/ mother (3)
[k:i] /uki/ open (21)
[en:i], [ni] /meni/ went (27)
[mm:i], [mmi], [mi] /nmi/ candy (33)
In the first two cases (3, 21) Joel is producing a geminate instead of the diphthongs.
In the last two cases (27, 33), alongside the forms with the correct (short) quantity there
are forms with just a simple prolongation of a single consonant so that a geminate
structure is produced. In addition to the geminate target words mentioned above,
there are seven target words with medial consonant sequences (consisting of
consonants that appear within different syllables) C1C2(:). In Finnish there are
more than 50 word-medial consonant sequences, which are a challenge for children.
As with many children at the first-word stage, Joel has a constraint that
produces only singleton medial consonants, so he produces just one consonant of
the consonant sequences (excluding [io] /sisko/ 'sister' (45), with no consonants):
[ken:] /kek/ shoe (15)
[k:] /k:/ duck (20)
[kot:i] /kort:i/ card (17)
[it:i] /irti/ off (44)
[kek:i] /keksi/ biscuit (29)
[ot::] /nost:/ carry (47)
These types of forms, produced by the reduction of the medial consonant
sequence, are very often produced by 1- and 2-year-old Finnish children.
However, Joel's form [it:i] /irti/ (44) can be counted as an example of geminate
template matching since Finnish children avoid the /r/ slot, more often filling it
with (compensatory) lengthening of the preceding vowel, which is here /i/ ([i:ti])
(Savinainen-Makkonen and Kunnari 2004).
The general early tendency (Stoel-Gammon 2002) to prefer words with reduplicated
syllables as well as to (over)produce identical consonants was clearly
shown above. Taken together, out of the first 50 words there are as many as 34
(68 percent) words with one consonant type only. There are, however, words with
two consonant types correctly realized. Although Joel seems to be able to produce a
velar and dental combination, e.g., [ki:t:i], [kit:i] /ki:t:i/ 'thanks' (12), [ken:]
/kek / 'shoe' (15), and two different dentals, [tn:e] /tn:e/ 'here (illative)'
(16), /t/ is omitted in the form [ip:o] /tip:u/ 'fell down' (23). This form, along with
other forms that have the feature of consonantal labiality, i.e., [pm] /pl:o/ 'ball'
(9), [mi], [mmi], [mm:i] /nmi/ 'candy' (33), seems to point to a specific
constraint which precludes the co-occurrence of other consonantal features when
consonantal labiality is realized. Towards the end of the 50-word period this
constraint ceases to apply and labial features can be combined with other features:
[k:p:i:] /k:p:i:(n)/ (into the) cupboard (illative) (35)
[mnini] /mndri:ni/ mandarin (orange) (36)
[kup:i] /kup:i/ cup (41)
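The notion of a word with "one consonant type only" can be illustrated with a small sketch that collects the distinct consonant symbols in a transcribed form, ignoring the length mark. The vowel inventory and the function name are assumptions made for this illustration, and the counts reported above were not produced with this code.

```python
# Assumed vowel inventory for this sketch; every other letter counts as a consonant.
VOWELS = set("aeiouyäö")


def consonant_types(form):
    """Return the set of distinct consonant symbols in a transcribed form."""
    return {ch for ch in form if ch.isalpha() and ch not in VOWELS}


def has_single_consonant_type(form):
    """True if the form contains exactly one consonant type."""
    return len(consonant_types(form)) == 1
```

On this reading, `kik:i` and `tut:i` each contain a single consonant type, whereas `kup:i`, which combines a velar with a labial, contains two.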
Discussion
Joel's data do not support the strong dominant position of the CV(CV) structure
or the saliency of the word-initial position. In Joel's case, the same phonemes
were generally found in both initial position and medial position; neither
position was stronger. Word-initial position, however, does seem weaker in
other ways: both assimilations and omissions characterize word-initial consonants.
Indeed, word-initial consonant omission is found to be a common process
during the early stages of speech development in Finnish (Kunnari 2000;
Savinainen-Makkonen 2000a, 2001). Many early child forms do have the
structure CVC:V, but often also VC:V, like Joel's [et:] /et:/ 'water' (however,
not [tet:]). Viewed from this perspective the CV structure is fairly
unstable during the early stages of development for children learning Finnish.
It would seem that at least for some children acquiring Finnish the medial
geminate template is the most salient part of a word and the initial consonant
only an optional segment, despite the fact that the first syllable always carries
word stress in Finnish. Instead of the word stress contributing to saliency
(prominence), quantity as a suprasegmental feature seems to be at work here.
If we look at Joel's target words, we find the geminate structure in 29 cases
(58 percent). Unfortunately, we do not have statistics on geminate words in
general in standard Finnish. However, Vainio's (1996) statistics on phoneme
frequencies show that taken together all Finnish long phonemes (both consonants
and vowels) make up only 10 percent of all phonemes. Do the targets Joel
attempted result from selectivity? Has he sought out adult words with a particular
structure? There are only minor references to early Finnish target words.
In Kunnari's (1997) list of the most popular early targets (/iti/ 'mother', /n:/
'give', /hu:/ 'doggie', /kk:/ 'poo-poo', /ki:k:u:/ 'swing', /mito/ 'milk',
/mum:u/ 'grandma', and /tut:i/ 'dummy') among 10 Finnish-speaking children,
geminate words are common. Moreover, in Finnish child-directed speech,
structures such as CVC:V, CV:CV and CV:C:V are preferred (Toivainen
1994), so that for example words which potentially could have a word-final
consonant are excluded, e.g., /hep:/ 'horsie' is preferred to /hevonen/ 'horse',
and /pos:u/ 'piggie' to /porss/ 'pig'. It is possible that geminate words are
overrepresented in many Finnish children's early targets. So far the only study
that has examined the early target words of Finnish children was done by
Vihman and Velleman (2000), and it analyzed Kunnari's (2000) data from
five pairs of a parent and a child at the 25-word point. The content words the
five mothers used had 38 percent geminates. The words the children targeted
consisted of 49 percent geminate consonants, and the forms they produced had
as much as 55 percent. According to this small sample, geminate structures
seem very general both in adult models and in child forms. However, Joel
produces them even more often. It is possible that a geminate template might be a
child Finnish pattern that Joel found especially congenial. Interestingly, in
Savinainen-Makkonen's (2000b, 2000c) study, the boy Antti used the geminate-template
strategy for long words, producing forms like [uk:] /lusik:/ 'spoon'.
It is often difcult to compare the data of the early stages of speech develop-
ment because of differences, for example, in the criteria for the denition of a
word and also due to transcription problems. It is even more difcult to interpret
the results when two languages which differ in many respects are compared, as
in the case of English and Finnish. In studies that measure the strength of
positions in words, significant differences may be seen in the length of words
with regard to syllables, which makes the interpretation of results difficult.
However, as the CV-structure cannot usually describe the forms produced by
Finnish children, the syllable may not be the most important unit of early speech
development. The phonology of Finnish children may be fundamentally organ-
ized at the whole-word level.
Notes
1. Finnish has a very rich morphophonology. The basic principle of word formation is
the addition of suffixes to stems. However, the suffixes are not attached mechanically
to the stem. There are several kinds of vowel and consonant changes, triggered by the
addition of suffixes, so that a word may be represented by different stems depending
upon which suffixes are included in it.
2. Finnish has phonemic consonant and vowel length; both consonants and vowels can
be either long or short. Differences in quantity indicate differences in meaning, as for
example in /kuk/ 'who' vs. /kuk:/ 'flower'. A long consonant is called a geminate.
3. The 25-word point (defined as the first 30-minute recording session in which 25
different identifiable word types were used spontaneously) refers to a cumulative
lexicon of 50 words; see Vihman, Ferguson, and Elbert 1986; Vihman and Miller 1988.
4. There are relatively few words which end in a consonant in Finnish; only the dentals
/t, s, l, r, n/ appear in final position, and liquid-final (/r, l/) disyllables are rare. In
marginal words and onomatopoeia one can find other types of consonants
word-finally (e.g., huh 'oh') and even word-final consonant clusters (e.g., hups 'oops'),
which do not exist in standard Finnish. Karlsson (1983) estimates that there are 100
nominals ending in a consonant, whereas 30,000 end in a vowel. Finnish words are
constructed by attaching morphemes to the base form. In inflected word forms
word-final consonants are common. One can find word-final consonants in the plural form,
e.g., tyt+t 'girls', in the genitive singular, e.g., tyt+n 'girl's', and as part of the
marker for the illative case, e.g., koti+in 'to home'.
References
Allen, G. D. and Hawkins, S. (1978). The development of phonological rhythm. In
A. Bell and J. B. Hooper (eds.), Syllables and segments, pp. 173–85. Amsterdam:
North-Holland.
(1980). Phonological rhythm: definition and development. In G. Yeni-Komshian,
J. Kavanagh, and C. Ferguson (eds), Child phonology, vol. 1: Production,
Aoyama, K. (2000). A psycholinguistic perspective on Finnish and Japanese prosody:
perception, production and child acquisition of consonantal quantity distinctions.
Unpublished PhD dissertation, University of Hawaii.
Bernhardt, B. and Stemberger, J. P. (1998). Handbook of phonological development: from
the perspective of constraint-based nonlinear phonology. San Diego: Academic Press.
Chin, S. and Dinnsen, D. (1992). Consonant clusters in disordered speech: constraints
and correspondence patterns. Journal of Child Language, 19, 259–85.
Demuth, K. and Fee, J. (1995). Minimal words in early phonological development.
Unpublished MS, Brown University and Dalhousie University.
Dinnsen, D. A. (1996). Context effects in the acquisition of fricatives. In B. Bernhardt,
J. Gilbert, and D. Ingram (eds.), Proceedings of the UBC International Conference
on Phonological Acquisition, pp. 136–48. Somerville, MA: Cascadilla Press.
Echols, C., Crowhurst, M. J., and Childers, J. B. (1997). The perception of rhythmic units
in speech by children and adults. Journal of Memory and Language, 36, 202–25.
Edwards, M. L. (1979). Word-position in fricative acquisition. Papers and Reports on
(1996). Word position effects in the production of fricatives. In B. Bernhardt, J. Gilbert,
and D. Ingram (eds.), Proceedings of the UBC International Conference on
Phonological Acquisition, pp. 149–58. Somerville, MA: Cascadilla Press.
Fenson, L., Dale, P. S., Reznick, J. S., Bates, E., Thal, D., and Pethick, S. J. (1994).
Variability in early communicative development. Monographs of the Society for
Research in Child Development, 59 (5, serial no. 242).
Gerken, L. (1994). A metrical template account of children's weak syllable omissions
from multi-syllabic words. Journal of Child Language, 21, 565–84.
Grunwell, P. (1985). Phonological assessment of child speech. Windsor: NFER Nelson.
(1987). Clinical phonology, 2nd edn. London: Chapman & Hall.
Howell, J. and Dean, E. (1994). Treating phonological disorders in children: Metaphon
theory to practice, 2nd edn. London: Whurr.
Ingram, D. (1989). Phonological disability in children, 2nd edn. London: Whurr.
(1999). Phonological acquisition. In M. Barrett (ed.), The development of language,
pp. 73–97. Hove, UK: Psychology Press.
Jakobson, R. (1968). Child language, aphasia and phonological universals, trans.
A. Keiler. The Hague and Paris: Mouton. (Originally published as Kindersprache,
Aphasie und allgemeine Lautgesetze. Uppsala: Almqvist & Wiksell, 1941.)
Johnson, J. S., Lewis, L. B., and Hogan, J. C. (1997). A production limitation in syllable
number: a longitudinal study of one child's early vocabulary. Journal of Child
Jusczyk, P., Cutler, A., and Redanz, N. J. (1993). Infants' preference for the predominant
stress patterns of English words. Child Development, 64, 675–87.
Karlsson, F. (1983). Suomen kielen äänne- ja muotorakenne [Finnish phonology and
morphology]. Porvoo, Helsinki, and Juva: WSOY.
Kehoe, M. and Stoel-Gammon, C. (1997). Truncation patterns in English-speaking
children's word production. Journal of Speech, Language and Hearing Research,
40, 526–41.
Kunnari, S. (1997). Fonologisen kehityksen varhaisvaiheet [Early phonological devel-
opment]. Suomen logopedis-foniatrinen aikakauslehti, 17, 338.
(2000). Characteristics of early lexical and phonological development in children
acquiring Finnish, PhD dissertation, University of Oulu (Acta Universitatis
Ouluensis). Oulu University Press.
(2002). Word length in syllables: evidence from early word production on Finnish.
First Language, 22, 119–35.
Kunnari, S., Nakai, S., and Vihman, M. M. (2001). Cross-linguistic evidence for acquisition
of geminates. Psychology of Language and Communication, 5, 13–24.
Kunnari, S. and Savinainen-Makkonen, T. (1999). Production of three-syllable words.
Poster presented at the VIIIth International Congress for the Study of Child
Language, San Sebastian.
Lyytinen, P. (1999). Varhaisen kommunikaation ja kielen kehityksen arviointimenetelmä
[Finnish manual for communicative development inventories]. Jyväskylä: Niilo
Mäki Instituutti.
Macken, M. (1993). Developmental changes in the acquisition of phonology. In B. de
Boysson-Bardies et al. (eds.), Developmental neurocognition: speech and face
processing in the first year of life, pp. 435–50. Dordrecht: Kluwer.
(1983). Development of articulatory, phonetic and phonological capabilities. In
Nettelbladt, U. (1983). Developmental studies of dysphonology in children, PhD dissertation,
University of Lund (Travaux de l'institut de linguistique de Lund). Lund:
CWK Gleerup.
Richardson, U. (1998). Familial dyslexia and sound duration in the quantity distinctions
of Finnish infants and adults, PhD dissertation, University of Jyväskylä. Jyväskylä:
Studia Philologica Jyväskyläensia.
Saaristo-Helin, K., Savinainen-Makkonen, T., and Kunnari, S. (2006). The phonological
mean length of utterances: methodological challenges from a crosslinguistic perspective.
Journal of Child Language, 33, 179–90.
Savinainen-Makkonen, T. (2000a). Word-initial consonant omissions – a developmental
process in children learning Finnish. First Language, 20, 161–85.
(2000b). Learning long words – a typological perspective. Language and Speech, 42
(2), 205–25.
(2000c). Learning to produce three-syllable words: a longitudinal study of Finnish
twins. In M. Perkins and S. Howard (eds.), New directions in language development
and disorders, pp. 223–31. New York: Plenum Publishing.
(2001). Suomalainen lapsi fonologiaa omaksumassa [Finnish children acquiring pho-
nology]. PhD dissertation, University of Helsinki (Publications of the Department of
Phonetics 42).
Savinainen-Makkonen, T. and Kunnari, S. (2004). Systemaattisen kauden rajoitukset
ja fonologiset prosessit [Constraints and phonological processes after the first
words stage]. In S. Kunnari and T. Savinainen-Makkonen (eds.), Mistä on
pienten sanat tehty? [From what are children's words made up?], pp. 99–109.
Helsinki: WSOY.
Stampe, D. (1969). The acquisition of phonetic representation. Proceedings of the Fifth
Regional Meeting of the Chicago Linguistic Society, 27, 433–44.
(1979). A dissertation on natural phonology. New York: Garland.
Stoel-Gammon, C. (1985). Phonetic inventories, 15–24 months: a longitudinal study.
Journal of Speech and Hearing Disorders, 53, 302–15.
(1996). Phonological assessment using a hierarchical framework. In K. N. Cole,
P. S. Dale, and D. J. Thal (eds.), Assessment of communication and language,
pp. 77–96. Baltimore: Paul Brookes.
(2002). Intervocalic consonants in the speech of typically developing children:
emergence and early use. Clinical Linguistics and Phonetics, 16, 155–68.
Toivainen, J. (1997). The acquisition of Finnish. In D. I. Slobin (ed.), The crosslinguistic
study of language acquisition, pp. 87–182. Mahwah, NJ: Erlbaum.
Toivainen, K. (1994). Hoivakielen tutkimuksesta ja suomen murteiden hoivasanastosta
[On research of babytalk register and on babytalk words in the dialects of Finnish].
In K. Toivainen and J. Toivainen (eds.), Ensikielen suomalaiskieli [Finno-Ugric
language as a first language] (Publications of the Department of Finnish and
General Linguistics of the University of Turku, 42), 429.
Tuomi, T. (1980). Reverse dictionary of modern standard Finnish, 2nd edn. Hämeenlinna:
SKS.
Turunen, P. (2003). Production of word structure: a constraint-based study of 2;6 year
old Finnish children at-risk for dyslexia and their controls, PhD dissertation,
University of Jyväskylä (Jyväskylä Studies in Languages 52).
Vainio, M. (1996). Phoneme frequencies in Finnish text and speech. In A. Iivonen and
A. Klippi (eds.), Studies in logopedics and phonetics 5 (Publications of the
Department of Phonetics, University of Helsinki, Series B: Phonetics, Logopedics
and Speech Communication, 6), 181–94.
Vihman, M. M. (1991). Ontogeny of phonetic gestures: speech production. In
I. G. Mattingly and M. Studdert-Kennedy (eds.), Universals in human language,
pp. 69–90. Stanford University Press.
Psycholinguistics, 7, 3–40.
development of linguistic vocabulary, pp. 151–83. New York: Academic Press.
Vihman, M. M. and Velleman, S. L. (2000). The construction of a first phonology.
Wijnen, F., Krikhaar, E., and Den Os, E. (1994). The (non)realization of unstressed
elements in children's utterances: evidence for a rhythmic constraint. Journal of
Appendix: Joel's first 50 words
No. Age Diary form Adult form: phonemic and orthographic Meaning
1 1;1.0 [eiei] /heihei/ hei hei bye-bye
2 1;2.0 [n:] /n:/ anna give
3 1;2.14 (1;5.14) [ti] [t:i], [ti] /iti/ iti mother
4 1;3.7 [p:u] /lop:u/ loppu allgone
5 1;3.14 [et: ] /et:/ vett water
6 1;4.14 [pp:] /aip:a/ vaippa napkin
7 [kk: ] /kak:/ kakk poo-poo
8 1;4.21 [kuk:] /kuk:/ kukka flower
9 [pm] /pl:o/ pallo ball
10 1;5.14 [tui:], [tui] /syli:n/ syliin lap*
11 1;6.0 [ii] /isi/ isi daddy
12 [ki:t:i], [kit:i] /ki:t:i/ kiitti thanks
13 [tut:i] /tut:i/ tutti dummy
14 [kik:i] /rik:i/ rikki broken
15 1;6.7 [ken:] /kek/ kenk shoe
16 [tn:e] /tn:e/ tnne here*
17 1;6.14 [kot:i] /kort:i/ kortti card
18 [pp:y] /lap:u/ lappu bib
19 [no:n] /no:n/ Noona Noona (name)
20 [k:] /k:a/ ankka duck
21 [k:i] /uki/ auki open
22 [ton:e] /ton:e/ tonne there*
23 [ip:o] /tip:u/ tippu fell
24 [mum:i] /mum:i/ mummi grandma
25 [uk:il] /uk:i/ ukki grandpa
26 [t:u] /ht:u/ hattu hat
27 [en:i], [ni] /meni/ meni went
28 [ep:], [pep:] /hep:/ heppa horsie
29 [kek:i] /keksi/ keksi biscuit
30 1;6.21 [kotiti], [kt:iti], [ko(.)titi] /traktori/ traktori tractor
31 [tui], [ui] /lusik:/ lusikka spoon
32 [kti] /ksi/ ksi hand
33 [mi], [mmi], [mm:i] /nmi/ nami candy
34 [tt:i], [t:i] /rt:i/ rtti cloth
35 [k:p:i:] /k:p:i:(n)/ kaappiin cupboard*
36 [mnini] /mndri:ni/ mandariini mandarin (orange)
37 [e:p:], [p:] /leip:/ leip bread
38 [kik::] /ki:k::/ kiikkaa swing
39 [uk:o], [uk:] /uk:o/ ukko old man
40 [p:p:] /s:p::t/ saappaat boots
41 [kup:i] /kup:i/ kuppi cup
42 1;7.0 [tti] /tti/ tti lady
43 [pip:u] /lip:u/ lippu flag
44 [it:i] /irti/ irti off
45 [i o] /sisko/ sisko sister
46 [en:u] /len:u/ Lennu Lennu (name)
47 [ot::] /nost:/ nostaa carry
48 1;7.7 [ot::] /otta:/ ottaa take
49 [tuti] /bus:i/ bussi bus
50 1;7.14 [ww:], [wuw] /u:/ vauva baby
* illative
14 Influence of geminate structure on early Arabic templatic patterns
Ghada Khattab and Jalal Al-Tamimi
This chapter reports on the early development of phonology during the
one-word stage in five Lebanese children, paying particular attention to the
influence of the adult phonology as well as to the children's individual journeys
towards adultlike patterns. The study contributes to two of the main aims of this
volume: first, it shows that early word shapes by Lebanese-speaking children do
not follow a straightforward developmental track from simple to complex
structures; rather, individual preferences in early productions and the frequency
or prominence of particular structures in the adult phonology play a major role
in shaping the phonological structure of words in the second year of life.
Second, the study sheds more light on the so-called U-shaped curve in develop-
ment whereby children may have accurate forms in their production at an early
stage of development but later regress; in this study this is scrutinized from
the point of view of the acquisition of phonological length in consonants and it
is suggested that accurate forms before and after a regression stage may be
qualitatively different, with only the latter showing real acquisition of adultlike
phonological structure.
1 The emergence of phonology and the role of cross-linguistic
differences
As shown by various contributions to this volume, children build their phono-
logical knowledge from an initially small repertoire of words that may occur
frequently in their input, attract their attention, and contain sounds that are
part of their babbling and early word practice; subsequently, their attempts at
producing these words gain the attention of caregivers who potentially repeat
the words to the children. The phonological structure of these words may
influence the children's subsequent selection of adult targets, as well as lead
to adaptation of phonologically distant targets to that same structure, resulting in
productive template use. The structure of words in the child's own first lexicon
and the segmental and phonological patterns of the adult language are
jointly responsible for the shape of the templates and for individual differences
in children's templatic shapes (Vihman and Croft 2007: 707). Below we unpick
some of these seminal ideas and look at cross-linguistic effects on children's
early words.
In a whole-word account, the child may group phonetically related words
together and acquire word shapes or word patterns as the basic units
(Ferguson and Farwell 1975; Menn 1983; Macken 1979). Children are
typically highly variable in their word production in the early stages, suggest-
ing a lack of command over individual sounds within these words and/or a
lack of abstract categorical knowledge of the sounds within them (Vihman
and Croft 2007: 689). Adaptations of adult words to the child's preferred
templatic shapes reveal the relationship between groups of words in the
child's lexicon and offer a window into the way children deal with challenges
with respect to particular sounds or sound sequences. Evidence for the
emergence and development of templatic behavior in a child's lexicon
includes: (a) consistency of patterning in several of the child's words
produced over several sessions; (b) occurrence of unusual phonological
correspondences between adult and child forms due to the influence of the
template; and (c) a sharp increase in words that fit the template (Vihman and
Croft 2007: 694–5). More phoneme-like categorization may appear when
reorganisation of word shapes and units takes place, though the child
may still use some of the preferred sounds from their early prosodic units
(Macken 1979: 34).
While early words may be similar cross-linguistically, the phonology of
each adult language (that is, the ambient language) influences the
first phonological templates that emerge out of these shapes and that start to
be applied to new words which are beyond the child's range (Vihman and
Croft 2007: 692). For instance, examples of English-speaking children's
templates include monosyllables with final nasals ([CVN]) or trochaic disyllables
with child-specific consonant or vowel components, e.g. [C₁VC₂V],
[CVjV], or [CV_low CV_high] (Macken 1979; Priestly 1977; Vihman, Velleman,
and McCune 1994). French-speaking children's templates, on the other hand,
tend to follow a language-specific prosodic shape, [ ()n ], with a final
stressed syllable, a counter-stress on the initial syllable, and up to two
optional syllables in between (Wauquier and Yamaguchi this volume), e.g.,
[a-o] template as in [ato] for /bato/ 'ship'; [afo] for /elef/ 'elephant'; and
[abal ] for / bal / 'one balloon'. In Estonian, Vihman and Vihman (2011)
find a C₀Vi/jV template (where C₀ represents an optional C), with a palatal
medial glide that is more consistent than the initial C, which is often omitted.
Note that while medial glides have been reported to occur as part of English
templates as well (Priestly 1977), the prosody of each language influences
the way other segments in the templatic structure are realized, as exemplified
by the initial consonant omission in Estonian but not English. The prominence
of medial position in Estonian has been discussed in studies on
languages with a quantitative length distinction, where gemination may
further attract the child's attention to medial consonants at the expense of
initial ones. These studies are discussed next.
2 The role of geminate structure in shaping early words
Waterson (1971: 181) suggests that in the early stages of production, the child
may produce only those features of the adult target that they can perceive and
easily reproduce. Long consonants fall into the category of sounds which
must at the same time be salient in the input, due to their prominent duration
(alongside non-durational cues, e.g., Al-Tamimi and Khattab 2011, 2012), and
relatively easy to produce, since children's early articulations are slow (e.g.,
Stoel-Gammon and Cooper 1984). While the child's long phonetic durations
in the early stages of production do not necessarily translate into contrastive
acquisition of segmental length, that early practice must provide a stepping
stone for later internalization of length as a phonological feature. Whereas for
languages like English the starting point for the child's production pattern is
often considered to be CV(CV), a phenomenon referred to as the 'core syllable
stage' (Demuth 1995; Fee 1995; Fikkert 1994), children acquiring languages
with quantitative medial contrasts have been shown to exhibit different early
patterns. For instance, while English has a dominant trochaic pattern with a
louder, higher-pitched first syllable, Finnish (which is also consistently trochaic)
has many medial geminates which may be inherently salient for children,
as mentioned above, and which may also attract their attention due to their
frequency in child-directed speech, as can be seen from the relatively high
number of medial geminates the children aim for and produce, regardless of
target (Vihman and Velleman 2000).
The prominence of the geminate structure in the language has led some
researchers to suggest that CVC:V, rather than CV(CV), is the starting point
for Finnish children. For example, Savinainen-Makkonen (2007: 346) looks at
data from a Finnish child, Joel, between the ages of 1 and 1;6 and finds that the
majority of his utterances (47 out of his first 50 words) have a disyllabic
structure. Furthermore, it is medial gemination rather than stress that seems to
govern what is deleted and what is retained in the productions of Finnish
children, who tend to omit initial consonants in trochaic shapes while showing
more accurate production of medial consonants. Similar results are reported by
Vihman and Velleman (2000), who were surprised to find that the second most
common pattern in their Finnish data (after consonant harmony) was onset
deletion (31 percent, both selected and adapted), a pattern considered to be a
sign of deviant phonology in English. Similar patterns have also been found for
a child acquiring Hindi (Bhaya Nair 1991), where deletion of onsets is present in
many disyllabic Hindi words with medial clusters or geminates.
Finnish and Arabic share common phonological patterns in the adult lan-
guage, including phonemic consonant and vowel length and rich morphopho-
nology, leading to multisyllabic words being frequent in the input due to the
addition of various suffixes to stems. One notable difference relates to the status
of initial consonants, with the phonology of Arabic disallowing onsetless
syllables (Watson 2002: 56). We were therefore interested to determine whether
children acquiring Arabic show similar patterns to children acquiring Estonian,
Finnish, Hindi, and other languages with gemination. What we found surprising
was that, within the scarce literature on phonological development in Arabic,
the acquisition of gemination has not been dealt with in any detail. In the next
two sections we present an overview of relevant aspects of Arabic phonology
before exploring findings from cross-linguistic acquisition studies which challenge
the Anglo-centric claims about the salience of initial consonants and the
typical patterns of acquisition of syllable structures.
3 Gemination and other relevant characteristics of Arabic
phonology
Arabic has a complex root-and-pattern (or nonconcatenative) morphology
(Watson 2002; McCarthy and Prince 1990a,b). In Arabic linguistics this system
is coincidentally also referred to as 'templatic' (although this usage should not
be confused with the terminology used in this volume to refer to developmental
processes). The stem of a content word in Arabic has three discontinuous
morphemes:
(a) the consonantal root (e.g., k, t, b), which is the underlying lexical unit of the
language that conveys semantic information (in the example here k, t, b
relates to writing);
(b) the templatic pattern into which the consonantal root is inserted, adding
morphosyntactic and phonological information to the root (e.g., the word
pattern referred to by Arabic linguists as faʕal expresses the past tense,
whereby /f/, /ʕ/, and /l/ are placeholders for each of the consonants in the
root, thus /katab/ 'he wrote');
(c) the interpolated vowels, which signal changes in voice (active or passive in
verbs), agent relations in nouns derived from verbs, and singular–plural
relations in nouns (e.g., /kutib/ 'it was written'; /kaatib/ 'writer'; /kutub/
'books').
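The root-and-pattern system in (a)–(c) can be illustrated with a minimal Python sketch. The digit-slot notation for patterns (1, 2, and 3 standing for the first, second, and third root consonant) and the function name interdigitate are assumptions made here for illustration; they are not notation used in the chapter, and the sketch covers only example forms cited in this section.

```python
def interdigitate(root, pattern):
    """Fill digit slots in a pattern with the matching root consonant.

    root    -- consonantal root as a string, e.g. "ktb" (hypothetical input format)
    pattern -- vowels plus digit slots, e.g. "1a2a3" (hypothetical notation)
    """
    # A digit n is replaced by the n-th root consonant; vowels are copied as-is.
    return "".join(root[int(ch) - 1] if ch.isdigit() else ch for ch in pattern)


# Patterns paired with the glosses given in the text for the root k-t-b.
KTB_PATTERNS = {
    "1a2a3": "he wrote",            # katab
    "1u2i3": "it was written",      # kutib
    "1aa2i3": "writer",             # kaatib
    "1u2u3": "books",               # kutub
    "1a22a3": "he made someone write",  # kattab: slot 2 repeated = medial geminate
}

for pattern, gloss in KTB_PATTERNS.items():
    print(interdigitate("ktb", pattern), "-", gloss)
```

Repeating a slot, as in 1a22a3, is one simple way to model the medial geminate of form II verbs; the same pattern applied to the root d-r-s yields /darras/.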
Whether Arabic speakers acquire and store whole stems or individual compo-
nents of their lexicon (roots, templates, and melodies) is a matter of debate (e.g.,
Boudelaa and Marslen-Wilson 2001, 2004; Ravid 2002) and is beyond the
scope of this study. What is of interest here is the wide range of resulting word
shapes that Arabic-speaking children are exposed to, many with a final coda.
Although a lot more work is needed on deriving frequency of occurrence of
various templatic shapes in both the adult lexicon and child-directed speech, the
three most commonly occurring shapes tend to be: (1) CVCV(C) for nouns
(e.g., /abal/ 'mountain'; /dawa/ 'medicine') or form I present perfect faʕal
verbs (e.g., /daras/ 'he studied'; /katab/ 'he wrote', etc.); (2) CVC:V(C) for
form II causative faʕʕal verbs (e.g., /darras/ 'he taught'; /kattab/ 'he made
someone write', etc.) or nouns (e.g., /batta/ 'duck'); and (3) CV:CV(C) for
nouns and form III active participles or nouns (e.g., /waadi/ 'valley'; /saadid/
'having blocked'). Out of the ten triliteral verb templates in Arabic, form II with
the geminate consonant is the most productive and the most common in modern
Arabic dialects (Watson 2002: 134). Medial gemination is also used in the
derivation of nouns of profession from form II verbs, resulting in an iambic
CVC:V:C shape, e.g., /xabba:z/ 'baker'.
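The three word shapes above can be made concrete by reducing a transcription to its CV skeleton. The sketch below assumes the romanization used in the examples, where a doubled letter (or a following colon) marks a long vowel or a geminate consonant; the names cv_skeleton and shape_class and the five-vowel set are illustrative assumptions, and the sketch makes no claim about shape frequencies in any corpus.

```python
import re

VOWELS = set("aeiou")  # assumed vowel letters of the romanization

def cv_skeleton(word):
    """Map a romanized form to a CV skeleton, marking length with ':'."""
    skeleton = []
    i = 0
    while i < len(word):
        ch = word[i]
        symbol = "V" if ch in VOWELS else "C"
        nxt = word[i + 1] if i + 1 < len(word) else ""
        # A doubled letter or a following ':' counts as one long segment.
        if nxt == ch or nxt == ":":
            symbol += ":"
            i += 1
        skeleton.append(symbol)
        i += 1
    return "".join(skeleton)

# The three shapes named in the text, each with an optional final consonant.
SHAPES = [
    ("CVCV(C)", r"CVCVC?"),
    ("CVC:V(C)", r"CVC:VC?"),
    ("CV:CV(C)", r"CV:CVC?"),
]

def shape_class(word):
    """Return the label of the matching shape, or 'other'."""
    skeleton = cv_skeleton(word)
    for label, regex in SHAPES:
        if re.fullmatch(regex, skeleton):
            return label
    return "other"

for w in ["daras", "katab", "darras", "kattab", "batta", "waadi", "saadid", "xabba:z"]:
    print(w, cv_skeleton(w), shape_class(w))
```

On these inputs the profession noun /xabba:z/ reduces to CVC:V:C and falls outside the three shapes, which matches its description in the text as a separate iambic shape.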
In terms of syllable structure, disyllables are much more common than
monosyllables, with nine out of the ten triliteral verb forms having a disyllabic
structure and the majority of nouns having disyllabic or trisyllabic structure
(Watson 2002: 134–65). The majority of disyllabic verbs have a trochaic stress
pattern, while nouns can be iambic or trochaic. Arabic is also a quantity-
sensitive language, with the mora playing an important role in syllable weight
(McCarthy and Prince 1986; Hayes 1989). The minimal word is thought to be
bimoraic, i.e., either consisting of a monosyllabic word with two vowels or a
coda (CV: or CVC), or a disyllabic CVCV word (Broselow 1992; McCarthy
and Prince 1986; 1990a,b). Syllable types in Lebanese Arabic include: CV (in
nonfinal position, e.g. /alam/ 'pen'); CVC (e.g., /sin/ 'tooth'); CV: (e.g., /laa/
'no'); CV:C (e.g., /be:b/ 'door'); CVCC (e.g., /nahr/ 'river'); CV:CC (e.g.,
/aamm/ 'public') (Khattab 2007; Nasr 1960, 1966; Obrecht 1968). CV is light
and does not occur in monosyllabic words; CVC and CV: are heavy; CV:C,
CVCC, and CV:CC are superheavy. Gemination and vowel length are two main
characteristics of syllable structure, and their weight is unaffected by syllable
position. Each vowel or geminate consonant has one mora, while singleton
consonants acquire weight by position, with onset consonants being weightless
and final consonants extra-metrical (Watson 2002: 54).
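The weight classes just described can be restated as a small Python sketch. It counts rime positions in a syllable-type label (one for a short vowel, two for a long vowel, one per coda consonant), which is a simplification assumed here because it reproduces the light/heavy/superheavy classification given in the text; it does not model moraic structure or extrametricality, and the function names are hypothetical.

```python
def rime_length(syllable_type):
    """Count rime positions in a type label such as 'CV:C'."""
    # The rime starts at the vowel; the onset consonant is ignored.
    rime = syllable_type[syllable_type.index("V"):]
    # Short vowel = 1 position, ':' adds a second, each coda C adds 1.
    return rime.count("V") + rime.count(":") + rime.count("C")

def weight(syllable_type):
    """Classify a syllable type as light, heavy, or superheavy."""
    n = rime_length(syllable_type)
    if n == 1:
        return "light"
    if n == 2:
        return "heavy"
    return "superheavy"

for t in ["CV", "CVC", "CV:", "CV:C", "CVCC", "CV:CC"]:
    print(t, weight(t))
```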
At the segmental level, all Arabic consonants can be geminated, which for
Lebanese Arabic means at least 28 consonants (Table 14.1). Many disyllabic
French loanwords are also pronounced with a long medial consonant in the
French accent in Lebanon (e.g., tape /tap/ [tapp] 'clap'; papa /papa/ [pappa]
'daddy', etc.), contributing to the high frequency of words with long
medial consonants in the adult phonology. Vowel length is also contrastive,
with the following impressionistic set for LA (there are no experimental
studies of LA vowels): /i, , e, e, , , , , u, , o, o, , /.
Geminate consonants are about twice as long as their singleton counterparts,
and the same applies to phonologically long vowels in comparison to
short ones (Khattab 2007; Khattab and Al-Tamimi 2008, 2012).
Nondurational cues also play a secondary role in the singleton-geminate
contrast (Al-Tamimi and Khattab 2011, 2012).
Table 14.1. Consonant inventory of Lebanese Arabic (adapted from Khattab 2007)
Bilabial Labio-dental Dental-alveolar Post-alveolar Palatal Velar Uvular Pharyngeal Glottal
Plosive (p) b t d
t d
k () (q)
Nasal m n
Trill r
Tap
Fricative f (v) s z
s z
x h
Approximant
(+ lat. app)
w
(lab-vel.)
l l j
Note: Three of the sounds in brackets occur only in loanwords (/p/, /v/, and //), while /q/ is normally realized as [] in most Lebanese dialects but retained as [q] by the
Druze community and in the Standard variety.
4 Studies on the acquisition of Arabic
Given that exposure to frequent prosodic structures in a language may explain
earlier acquisition of these structures in that language, the properties of Arabic
prosodic structure described above suggest the following predictions: Arabic-
speaking children may: (a) produce disyllables early in the acquisition process;
(b) show coda production early; and (c) acquire gemination and complex
syllables early. These patterns can indeed be found in the data from studies on
the acquisition of Arabic phonology, but they are seldom highlighted or discussed
in any detail, perhaps because it is difficult to reconcile these results with
the often assumed universal sequence of syllable structure acquisition.
Moreover, studies on phonological acquisition in Arabic, whether large-scale
cross-sectional or small and longitudinal, have mostly looked at the order
of acquisition of consonants and the phonological processes exhibited by
Arabic-speaking children (e.g., Amayreh and Dyson 1998; Ammar and Morsi
2006; Dyson and Amayreh 2000; Saleh, Shoeib, Hegazi, and Pakinam 2007;
Shahin 1995, 2003), though more recent studies have looked at syllable struc-
ture as well (e.g., Abdoh 2011; Ammar 2002; Salem 2000). Here we review
relevant findings from some of these studies.
In two studies looking at the acquisition of Jordanian Arabic consonants by
children aged 2;0 to 6;4 (across the two studies), Amayreh and Dyson (1998)
and Dyson and Amayreh (2000) found that medial consonants are much more
accurate than initial and final consonants, with no significant difference between
initial and final position. The authors wondered whether this result was influenced
by the stress pattern in the words they elicited (Amayreh and Dyson 1998:
651), but a look at the word list in their appendix shows a balanced number of
iambic and trochaic stress patterns. In a parallel study on Egyptian children aged
1;0 to 2;6, using naturalistic data, Saleh et al. (2007) surprisingly found final
position the most accurate in terms of consonant realization, followed by medial
and lastly initial position, which showed the highest degree of errors in production
(substitutions and deletions). This was echoed in a study on the
acquisition of consonants in all word positions in twenty-one Palestinian
children aged 1;4 to 2;10 by Shahin (2003), who notes that final codas were
highly accurate (Shahin adopts a phonologically driven explanation, suggesting
that final codas are representationally onsets; see Harris and Gussman 1998).
Out of all four word positions, initial, medial onsets, and final codas were
deemed to be acquired early by the children, while medial codas were acquired
late. While this is not explicitly discussed in the study, the cross-sectional data
showed more accuracy for final consonants and medial onsets than initial-word
onsets, especially in the youngest age group (Shahin 2003: 917), and develop-
ment followed a nonlinear progression, with dips in accuracy at all ages and a lot
of individual variation.
Abdoh's (2011) study is among the few Arabic acquisition studies focusing
more on word shapes than segmental acquisition. The author looked at first
words in twenty-two Hijazi-speaking children aged 1;0–1;9 within Prosodic
and Moraic Theory approaches to phonological structure (e.g., McCarthy and
Prince 1990; Hayes 1989). Despite the fact that her data do not fully support the
presumed universal order of acquisition of word structure, Abdoh maintains that
the children in her study follow that order, albeit with a starting point that skips
the monomoraic core syllable stage. The children are said to start at the minimal
word stage where the maximal word size is a single binary foot and their outputs
display bimoraic forms (ages 1;1 to 1;6); at later stages (1;6–1;9) they are
reported as going beyond this stage and producing forms showing disyllabic
words with a trochaic (SW) or iambic (WS) foot, and more complex structures
exceeding the maximal size, i.e., structures with two feet. However, looking
at the children's most frequent word shapes in the early stages (Abdoh 2011:
149–155), the data show that disyllables constituted 60.9 percent of the children's
production, followed by monosyllables at 38.2 percent and then trisyllables
at 0.9 percent. When these three word shapes are combined, the
frequency of word types produced is the following: CVCV (29.1 percent) >
CVC > CVC:V > CV:CV > CV (10 percent). Note that coda production is
present from an early age (e.g. /dub/ 'bear'; /ba:b/ 'door', etc.), despite reported
cases of coda deletion. Gemination is also reported to be acquired early,
particularly in medial position (Abdoh 2011: 149). The author points out that
one reason for this might be that medial geminates often appear in baby talk
(e.g., /dubba/ teddy bear; /dadda/ grandma, etc.). More interestingly, child-
rens truncation patterns seem to preserve nal syllables regardless of stress,
e.g., /fusta:n/ dress realized as [ta:n]; but also /arnab/ rabbit realized as
[nab] and /samaka/ sh realized as [ka].
Similar results regarding the early acquisition of complex syllable structures
were reported by Ammar (2002), whose study of syllable structure in the speech
of ten Egyptian children aged 2;0 to 3;0 found that 90 percent of the children
had acquired all syllable types. Ammar also reports on final consonant deletion
being accompanied by lengthening of the preceding vowel. Furthermore,
although she and other authors note cluster reduction in all the children up to
age 4, she notes that clusters in CVCC are acquired earlier by Egyptian children
than by English-speaking children (Ammar 1999).
In sum, the results from these studies highlight the influence of the adult
phonology on Arabic children's early words in terms of the early acquisition of
medial and final consonants, complex syllable structures, and the predominance
of disyllables in early words. However, very little mention is made of the
potential role of gemination in shaping Arabic children's early words and
influencing their attention to noninitial word positions. Moreover, with most
of the above studies being cross-sectional in design, very little attention has
been paid to individual children's development of phonology from the earliest
stages of production. The present study therefore aims to fill this gap.
5 Current study
The data presented here are part of a longitudinal study of ten Lebanese
children, five based in Beirut and five in London (only the Beirut data are
presented here). The study was carried out as part of an investigation of the
acquisition of gemination by Lebanese-speaking children exposed to Lebanese
Arabic alone and in conjunction with English and/or French. The Beirut-based
families were recruited from the Greater Beirut area, but no further control for
dialect was imposed. The emphasis was on locating families who were mainly
Arabic-speaking (the use of French and/or English alongside Arabic is very
common in Lebanon). The children were primarily cared for by their mothers
and none had started attending nursery in the first two years of life. The children
were recorded once a month from around 9 months of age until their third
birthday. The recordings used for this chapter are for the sessions where the
children were deemed to be at the 4-word-point (4wp, i.e., when they produced
4 different word types spontaneously in a session) and all subsequent sessions
leading up to the 25-word-point (25wp, when the children produced 25 different
word types spontaneously in a session and had around 50 words in their
vocabulary). Their ages ranged between 1;1 and 1;6 at the 4wp and 1;9 and
2;2 at the 25wp. The number of months that elapsed between the two points
ranged between four and nine (Table 14.2).
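The session-selection criterion described above can be illustrated with a minimal sketch. It assumes that the 4wp and 25wp are the first sessions with at least 4 and at least 25 spontaneous word types respectively; the function name and the monthly counts are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence, Tuple


def find_word_points(
    spontaneous_types: Sequence[int], first: int = 4, last: int = 25
) -> Tuple[Optional[int], Optional[int]]:
    """Return 1-based session numbers for the 4wp and the 25wp.

    Assumption: each point is the first session that reaches the threshold,
    and the 25wp cannot precede the 4wp.
    """
    start = next(
        (i for i, n in enumerate(spontaneous_types, 1) if n >= first), None
    )
    if start is None:
        return None, None
    end = next(
        (i for i, n in enumerate(spontaneous_types, 1) if i >= start and n >= last),
        None,
    )
    return start, end


# Hypothetical counts of spontaneous word types in successive monthly sessions.
counts = [0, 2, 4, 9, 12, 18, 27, 31]
start, end = find_word_points(counts)
# Sessions retained for analysis: from the 4wp up to and including the 25wp.
analysed = list(range(start, end + 1)) if start and end else []
print(start, end, analysed)
```

In this sketch, sessions after the 25wp are simply ignored, which mirrors the way the chapter reports data only up to the final criterion session.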
5.1 Procedure
The children were recorded at home while engaged in 30–40-minute spontaneous
interactions with their mothers, and occasionally with grandparents or
older siblings. The mothers were instructed to engage in play sessions with the
children as they normally would, using familiar toys, picture books, and other
household items, while at the same time trying to elicit words/utterances they
knew the children were able to produce.
Recordings were made in mono, 16-bit, 44.1 kHz sampling rate, using an
Edirol R9 solid-state recorder with high-quality wireless Sennheiser UHF
microphones, one worn by the mother and one hidden in a baby vest worn by
the child. Simultaneous video recordings were also made using a Sanyo
camcorder, and both audio and video recordings were used for the word
identification process, while phonetic transcription relied mainly on the audio. The
files were transferred onto a computer and the child's utterances segmented,
labeled, and transcribed using narrow IPA transcription for all segmental
material. Both Praat v.5.1.10 (Boersma and Weenink 2009) and PHON
v.1.5.2 (Rose 2012) were used for processing the audiovisual files (Praat
allowed easier segmentation and labeling of speech, while PHON allowed
transcription using both audio and video outputs).
The children's utterances were categorized as babbling (vocalizations with
no identifiable target or communicative function), words (utterances with an
identifiable target, using Vihman and McCune's 1994 word identification
procedure), or unidentifiable (utterances that were either unintelligible or
where a word target was suspected but could not be established even after
going through the word-ID test). Sessions in which the children had 4 to 25
identifiable spontaneous words were included in the analyses. Imitations were
also recorded and analyzed separately to determine whether they showed
different patterns.
Table 14.2. Overall data. Number of recording sessions required from the 4wp (session 1) to the 25wp (final reported session) for
each of the children. The table shows each child's age at the 4wp and the number of word types and tokens (in brackets) produced in
each session. Numbers include imitated productions, which were not counted towards the 25-word criterion.
Child's name  Age at start  Session 1  Session 2  Session 3  Session 4  Session 5  Session 6  Session 7  Session 8  Session 9  Total
Rama          1;6.11        5 (17)     17 (50)    15 (27)    10 (23)    35 (86)                                                82 (203)
Martin        1;3.06        11 (75)    9 (54)     16 (57)    19 (63)    22 (41)    29 (174)   46 (140)                         152 (604)
Lina          1;3.25        7 (27)     9 (28)     2 (11)     10 (31)    15 (28)    19 (57)    19 (59)    64 (206)              145 (447)
Hiyam         1;1.05        4 (15)     12 (56)    13 (48)    17 (52)    12 (22)    47 (130)   45 (105)   55 (149)              205 (577)
Mohamed       1;6.02        5 (20)     8 (42)     19 (70)    25 (70)    19 (90)    21 (97)    16 (101)   48 (153)   89 (389)   234 (1032)
All           mean age: 1;4                                                                                         Total:     818 (2863)
As can be seen from Table 14.2, the children vary in how quickly they get to
the 25wp, the fastest being Rama, who reached criterion within five months, and
the slowest Mohamed, who took twice as long. Interestingly, age at the 4wp
does not predict how quickly the children will accumulate a vocabulary of
around 50 words, since at the 4wp Rama and Mohamed are coincidentally the
same age and the oldest children in the group. Both had been followed from an
early age (around 11 months), and the differences between them were obvious
right away: Rama was voluble from the start, but her utterances in the early
recordings mostly consisted of babbling and lengthy unanalyzable jargon (often
monologues) that neither her mother nor the fieldworker could identify as
words. Her 4-word session at age 1;6 marked the beginning of a change in her
vocal behavior, as she became less vocal (mostly due to producing less jargon)
but began producing utterances that had identifiable targets and were fairly
accurate. This remained the trend up to and including the 25wp. Mohamed, on
the other hand, was a much more cautious and quiet child at the beginning. His
mother noted that his speech was developing more slowly in comparison with
that of his older brother. He was a lot less vocal than Rama in the sessions
leading up to the 4wp and then had several sessions with no noticeable increase
in vocabulary (based on the recordings from sessions 3 to 7 and on his mother's
observations).
On average, the children's age at the 4-word point (mean 1;4) is higher than
what is sometimes reported for US English (Vihman, Ferguson, and Elbert
1986; Vihman and McCune 1994); all of the children experience a spurt in their
production at some stage around the 25wp, in terms of either the overall number
of tokens (Martin, session 6; Lina, session 6) or both word types and tokens
(Rama, session 5; Hiyam, session 6; Lina, session 8; Mohamed, sessions 8–9).
This tends to coincide with either the session identified as the 25wp or the
session immediately before that.
5.2 General patterns
As expected, Arabic words constituted the majority of utterances at 65 percent,
followed by English (18 percent) and then French (8 percent). Words which
could belong to more than one language were labeled as multilingual and
constituted the remaining 9 percent of the data (Table 14.3 and Figure 14.1).
Note that our interest in categorizing the utterances into the three languages here
was driven by the need to examine the influence of the language of origin on
the syllable and word structure of the utterances that the children heard and
produced. While the majority of the utterances that were labeled English and
French in this study had commonly used translation equivalents in Arabic, they
were not necessarily code-switches on the part of the children; an account of
code-switching behavior would require a different type of discourse analysis in
order to establish whether the utterances were part of the Arabic child-directed
speech that the children heard or genuine switches to French or English
discourse by mother and/or child, which is beyond the scope of this study.
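As a quick arithmetic check, the language shares quoted above can be recomputed from the raw counts reported in Table 14.3. The snippet below is only an illustrative sketch; the variable names are our own.

```python
# Utterance counts by language of origin, as reported in Table 14.3.
counts = {"Arabic": 1864, "English": 514, "French": 217, "Multilingual": 268}

total = sum(counts.values())

# Share of each language, rounded to the nearest whole percent.
shares = {lang: round(100 * n / total) for lang, n in counts.items()}

for lang, pct in shares.items():
    print(f"{lang}: {counts[lang]} of {total} utterances ({pct}%)")
```

The rounded shares add up to 100 percent and agree with the figures given in the text.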
The distribution of early word shapes in Figure 14.1 reflects the differences in
frequencies of mono- and disyllables in the three languages (e.g., Menn 1971;
Rose and Wauquier-Gravelines 2007; Stoel-Gammon 1987), with the majority
of early Arabic and French targeted words being disyllabic (68 and 79 percent
respectively) while the majority of English words are monosyllabic (66
percent). Here, multisyllabic word frequency cannot be compared across the
three languages because of the small numbers involved; as the children's
productive abilities increased over the sessions, the emergence of multisyllabic
words (with more than two syllables) was most prominent in Arabic, their
dominant language. The difference in word shapes across the three languages
was also reflected in the syllable structure within each word shape. For instance,
within monosyllabic words targeted by the children, the most frequent syllable
structure for Arabic words was CVC: (with a final geminate consonant), e.g.,
/ba/¹ 'all gone', that for English words was CVC, e.g., cat, and for French
words it was CV, e.g., deux /d/ 'two' (Figure 14.2). The same applies to
disyllables (Figure 14.3), with the most frequently targeted disyllabic shapes
in Arabic being CVC:V, e.g., /nanna/ 'food', CV:CV, e.g., /baba/ 'daddy',
and CVCV, e.g., /taa/ 'come here'; the most frequent targeted shapes for
English were CVVCV(C), e.g., baby /bebi/, followed by CVCV, e.g., teddy
/tdi/, and CCVCV, e.g., story /stoi/. French disyllable shapes showed a much
more skewed pattern towards a single structure, which was CVCV (90 percent),
e.g., chapeau /apo/ 'hat'.

[Figure 14.1. Distribution of target word shapes (monosyllable, disyllable, multisyllable) as a function of utterance language (Arabic, English, French). N = 2863.]

Table 14.3. Language of origin for the utterances targeted by the children
        Arabic      English    French    Multilingual  Total
Total   1864 (65%)  514 (18%)  217 (8%)  268 (9%)      2863
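The shape labels used in this section (CVC, CV:C, CVC:V, and so on) can be derived mechanically from a transcription. The following is a minimal sketch under simplifying assumptions that are ours, not the chapter's: forms are given in plain ASCII, the vowel set is restricted to five letters, a colon marks a long segment, and a doubled letter is treated as a single long segment. The function name is hypothetical.

```python
# Assumption: a simplified ASCII vowel inventory, for illustration only.
VOWELS = set("aeiou")


def cv_skeleton(form: str) -> str:
    """Map a broad ASCII transcription to a CV skeleton.

    Long segments are written as 'C:' or 'V:'. Length is detected either
    from an explicit ':' or from a doubled letter (e.g. 'nn' in 'nanna').
    """
    slots = []
    prev = None
    for ch in form:
        if ch == ":" or ch == prev:
            # Mark the preceding slot as long, but only once.
            if slots and not slots[-1].endswith(":"):
                slots[-1] += ":"
            continue
        slots.append("V" if ch in VOWELS else "C")
        prev = ch
    return "".join(slots)


# Forms cited in the chapter, written without slashes.
for word in ["dub", "ba:b", "dubba", "nanna"]:
    print(word, cv_skeleton(word))
```

A real analysis would operate on narrow IPA rather than ASCII and would need a fuller segment inventory, so this should be read as an illustration of the labeling convention only.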
[Figure 14.2. Distribution of the three most frequent syllable shapes for monosyllabic words targeted by the children in Arabic, English, and French utterances. Other, less frequently used shapes are not shown. Panel title: Target monosyllables; N = 849; shapes shown: CVC:, CV:, CV:C, CVC, CVV, CV, CV, CCCV, CVCC.]
[Figure 14.3. Distribution of the three most frequent syllable shapes for disyllabic words targeted by the children in Arabic, English, and French utterances. Other, less frequently used shapes are not shown. Panel title: Target disyllables; N = 1612; shapes shown: CVC:V, CV:CV, CVCV, CVVCV, CVCV, CCVCV, CVCV, CVCCV, CVCCVC.]
On the whole, disyllables constitute a large part of the children's early
word shapes (59 percent), which is expected, given the predominance of
disyllables in Arabic and French. Moreover, due to the high frequency
and salience of the medial geminate pattern in Arabic, all children both
aim for disyllabic shapes with medial geminate or heavy targets (with
clusters or affricates) and adapt other shapes to the CVC:V(C) pattern.
Baby words (/buwwa/ 'water'; /nanna/ 'food'; /bsse/ 'pussycat') and
nicknames (e.g., /kitto/ 'Christopher'; /lillo/ 'Lina') also contributed to
the high number of disyllabic words with medial geminates. Figure 14.4
shows the distribution of disyllables that were targeted by the children (left)
in terms of whether the medial consonant was a single consonant (e.g.,
/ana/ 'I'), a geminate (/baddo/ 'he wants'), or complex (e.g., /mfte/
'key'), and how they were realized (Figure 14.4, right); the complex
category included clusters (e.g., /ftbl/ 'football') and secondary
articulations (e.g., /sus/ 'chick') in targets, but also affricated ([bob] for
French ballon 'ball') and other doubly articulated consonants ([ l n]
for /ana/ 'I') in the realizations. Long C: and complex realizations by the
children (54 and 30 percent respectively) are around 1.5 times as frequent
as geminate and complex targets, suggesting that the medial consonants of
many words with singleton targets were lengthened or produced with
complex articulation.
[Figure 14.4. Distribution of medial consonant type (single, geminate, complex) in disyllabic words targeted by the children (left, N = 1691) and their realization (right, N = 1456). Values shown: targets 51%, 21%, 28%; realizations 30%, 54%, 16%.]
5.3 Developmental patterns
On the whole, the children target similar word structures in the early (4wp)
and later (25wp) stages of production (Figures 14.5–6), with a wider range of
word shapes at the more advanced stage and an emergence of more complex
shapes (not all listed in the figures below due to their very low frequency).
One notable difference is a 14 percent drop in disyllabic CVC:V targets at the
25wp (Figure 14.6), but not in realizations; in fact, lengthening of singleton
consonants is still prominent and actually increases at the more advanced
stage (Table 14.4, Figure 14.9). The structure of the realizations for the most
frequent target word shapes does not change very much as the children
progress to the 25wp (Figures 14.7–8); this is due to the fact that the children
produce target-like structures from an early age, if phonological length is set
aside. What they seem to take some time to acquire is phonological length,
and their patterns of acquisition seem to involve experimenting with adding
phonetic length to all elements of the target syllable structures rather than just
to the phonologically long ones, or strengthening consonants (denoted as
Cs in the figures). For instance, a target CVC: can be produced not
just with a long coda, but also with a long onset and/or a long vowel, e.g.,
/ba/ 'all gone' realized as [ba], [bba], [ba], etc. Similarly, a target
CVCV can be realized with varying lengths for all segments, e.g., /baba/
'daddy' realized as [babbah], [babba], [bbabam], and [baba], etc.; the
realizations of disyllables with open final syllables in the target frequently
contained a final coda, often a guttural sound (glottal stop, glottal fricative, or
pharyngeal fricative) but occasionally also other consonants with supraglottal
places of articulation. While variable phonetic length may apply to
all children's early productions regardless of their native language, the
fact that Arabic has phonological vowel and consonant length may increase
the salience of contrastive duration for the children, leading to their extensive
experimentation with segment length and the production of syllables with
heavy rhymes and/or codas. Acoustic analysis is currently under way in
order to obtain a clearer picture of the relationship between phonetic and
phonological length in the children's productions.

[Figure 14.5. Most frequent types of word structures targeted in monosyllabic word shapes at the 4wp and the 25wp. Shapes constituting less than 1 percent of the data are not included. Panel title: Target monosyllables; N = 497; shapes shown: CVC:, CVC, CCV, CVV, CCV:C, CCV:, CCVC, CVCC, CVVC, CCVV, CCVVC, CV:C, CV:, CV.]

[Figure 14.6. Most frequent types of word structures targeted in disyllabic word shapes at the 4wp and the 25wp. Shapes constituting less than 1 percent of the data are not included. Panel title: Target disyllables; N = 1052; shapes shown: CVCV, CVCCV, CVCVC, CCV:CV, CCVCV, CV:CV:C, CVCCV(:)C, CV:CV(:), CVC(:)V:C, CV:C(:)VC, CVC:V, CV:CV.]

[Figure 14.7. Range of realizations for the most frequently targeted monosyllabic word shape, CVC:, at the 4wp (left, N = 20; realizations shown: CV:Cs, CVCV:C, CV:C:, CsV:C, CVCs) and 25wp (right, N = 94; realizations shown: C:V:C:, C:VC, CVC, C:V:C, CV:C). Here and elsewhere, Cs refers to a consonant that is articulated with extra strength/tenseness.]

[Figure 14.8. Range of realizations for the most frequently targeted disyllabic word shape, CVCV, at the 4wp (left, N = 100; realizations shown: CV:C:V:C, CVC:V:C, CVC:VC, C:V:C, CV:CV:, CV:CV:C) and 25wp (right, N = 210; realizations shown: CVC:V:C, CVCV:C, CV:C:V:C, CVCVC, CV:CV:C, CVCV).]

Table 14.4. Proportions of CVCV shapes being realized with a
singleton or a geminate consonant at each developmental stage
        CVCV realization
        Singleton  Geminate/strong
4wp     46%        54%
25wp    35%        65%
The prominence of monosyllabic CV(:)C(:) and disyllabic CV(:)C(:)V
shapes in the targets that the children are aiming for throughout the single-
word period can also be seen at the individual level, though with interesting
differences connected to each child's starting point (the structure of their earliest
words), the relative frequency of each language that they hear, and their
individual journey towards the 25wp. The next section looks at longitudinal
data from three of the children whose data are presented here in order to explore
the interaction between language-specific and individual differences in the
development of early phonological structure. In the data presented below only
one token per lexical item is presented, chosen from the most frequent and/or
most adultlike realizations.
[Figure 14.9. Target medial consonant type (singleton, complex, geminate) and realizations in disyllabic productions at the 4wp and the 25wp. Panel title: Disyllabic targets vs. realizations; N = 1142.]
Table 14.5. Martin's selected and adapted forms over the one-word stage.
Words were considered selected if the adult target matched the pattern of
interest and adapted if they were modified to fit the child's pattern(s). Shaded
grey is used for imitations. Here and elsewhere, the half-length symbol
following a consonant was used for half-long and/or noticeably strong/
tense articulation. Italics = French or English target
5.4 Individual paths and templatic behavior
5.4.1 Martin
Martin was exposed mostly to Arabic, often mixed with
French, and his production in the seven sessions that were analyzed reflects
that exposure (77 percent of his utterances are Arabic, followed by French at 13
percent and English at 6 percent). His 4wp was identified at age 1;3, which is close
to the mean age at the 4wp for the children studied here. He is the most systematic
of the children in that his earliest productions fell mostly in the CVC:V(C) pattern,
and this remained his favorite structure throughout. Below is a more detailed
account of Martin's phonological patterns across the one-word stage.
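The distinction between selected and adapted words that is used in the individual profiles (a word is selected if the adult target already matches the pattern of interest, and adapted if the child modifies it to fit the pattern) can be sketched as a small classifier. This is an illustrative sketch only: the skeleton strings, the regular expression for the CVC:V(C) template, and the extra "other" label for productions that do not fit the template are our own simplifications, not part of the study's coding procedure.

```python
import re

# Assumption: the CVC:V(C) template expressed as a regular expression
# over CV skeleton strings.
TEMPLATE = re.compile(r"^CVC:V(C)?$")


def classify(target_shape: str, realized_shape: str) -> str:
    """Label one production relative to the CVC:V(C) template."""
    if not TEMPLATE.match(realized_shape):
        # The production does not instantiate the template at all.
        return "other"
    if TEMPLATE.match(target_shape):
        # The adult target already has the shape: the word is selected.
        return "selected"
    # The target has a different shape but the output fits: adapted.
    return "adapted"


# Hypothetical skeleton pairs (target, realization).
examples = [
    ("CVC:V", "CVC:VC"),  # geminate target, produced with an added coda
    ("CCV", "CVC:VC"),    # monosyllabic target reshaped to the template
    ("CVC", "CVC"),       # production outside the template
]
for target, realized in examples:
    print(target, realized, classify(target, realized))
```

Counting the labels per session would give the kind of selected-versus-adapted comparison that is reported for each child below.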
At the 4wp all but one of Martin's word types have the CVC:V(C) shape, and
the majority of these (88 percent of tokens) show consonant harmony either in
the target or the realization or in both (Table 14.5). Martin adapts both mono-
and multisyllables to the disyllabic shape with a long medial C:, e.g., French
train /t / [tttah] 'train'; /habbu:ba/ [bbh] 'Habbouba' (nickname for
Hiba). He reduces initial consonants more frequently than medial ones, and
experiments with the lengths of all segments involved, e.g., /nanna/ 'food' is
realized as [n nn h], but also [ne h], [ enna h], [jnn h], [h nna h],
[n nn n], etc. An initial anchor syllable is often used as a form of support
for initial consonants, lengthening the initial consonant, e.g., [nnnn h] for
/nanna/ above. As expected, Martin's consonant inventory at the 4-word point is
relatively small, mainly consisting of bilabial and alveolar nasals and stops,
along with glottal stops and fricatives (Appendix).
Over the next two sessions, Martin maintains this pattern but also produces
relatively target-like words with disyllabic CV(:)CV(C) and monosyllabic
CV(:)C(C) patterns, e.g., /mam / [mmh] 'mummy'; /aww/ [a ww]
'woof'; /teta/ [t t] 'grandma'. His consonant inventory remains relatively
stable, with some attempts to target new stops and glides (/p(p)/ and /k(k)/, /()/
and /w(w)/). There is also an emergence of glottal and vowel-like codas for
target codas (e.g., /da/ [d] 'nice'; /ajj/ [ah] 'ouch'; /aww/ [a ww]
'woof'), alongside the frequent addition of the glottal and supraglottal codas
that Martin and the other children in this study exhibit (e.g., /baa/ [ m]
'peek-a-boo'; /nanna/ [n nn n] 'food'; /bbbo/ [bbuh] 'baby'). Sessions 4
and 5 contain a large number of imitated and relatively few spontaneous
utterances; since their phonological profile is similar, these have been combined in
Table 14.5. These sessions also exhibit the prominence of the CVC:V(C) pattern
with consonant harmony, particularly in adaptations (e.g., /fadja/ [de dd ]
'Faadia' (proper noun, henceforth PN); French Jesus /ezy/ [dyddy ] 'Jesus').
Medial consonants in Martin's imitated utterances are more target-like than
initial ones, which are more variable. Words with target fricatives and liquids
are targeted in imitations for the first time, with frequent stopping and other
adaptations, adding to the prominence of consonant harmony.
Martin's last two sessions show a marked change in the frequency of words
produced as well as a growing consonant inventory (Appendix), but Martin's
preferred CVC:V(C) pattern is still prominent, with adaptations that are twice as
frequent as the selected words with this pattern in session 6. These stand out
compared with the other minor word shapes that Martin produces, which tend to
be more accurate, e.g., CV(V)C(C): English that's /ats/ [dat s], Arabic /wen/
[ wn ] 'where'. The majority of words that Martin targets are still disyllables,
and despite his increased phonetic and phonological inventory his productions
still exhibit frequent consonant harmony. In the last session the CVC:V(C)
shape rises to 71 percent of all of Martin's productions, the highest since his first
session, which suggests that the medial long consonant template is at its most
productive for Martin as he approaches the 25wp. Consonant harmony is not as
prominent in this session, as new consonants are attempted and coda consonants
are more frequent. Session 7 also sees the geminate/long pattern being applied
to longer words as Martin starts producing multisyllabic words; multisyllabic
words with medial geminates like /tattuna/ [tttnh] 'nickname for Martin'
and /battajjet/ [ cc tt t] 'batteries' are selected, while disyllabic words
are sometimes adapted to the multisyllabic shape with one or two internal long
consonants, e.g., /bbbo/ [be b b b h] 'baby' and /nanna/ [j h] 'food'.
Although Martin's consonant inventory is expanding, variation in the
realization of some consonants is higher than in earlier sessions, especially in initial
position, e.g., for /b/, /k/, and /m/ (see Appendix). Medial codas are targeted but
are often assimilated to the next onset, adding to the geminate pattern, e.g.,
/mala/ [ mm h] 'spoon', /min hon/ [x nn nh] 'who's there?', and
/mfte/ [c a t ] 'key', but awareness of medial codas is noticeable and some
disyllables are adapted to that pattern, e.g., /mam / [bm h] 'mummy',
/koko/ [kl k h] 'nickname'.
5.4.2 Rama
Rama was exposed to more English than Martin, and 30 percent
of her utterances were English. Arabic still constituted the majority of her
utterances at 60 percent. Rama's first two sessions are combined, due to the small
number of spontaneous utterances in her first session (Table 14.6). Her profile at
this early stage of production is strikingly different from Martin's, mainly due to the
high frequency of monosyllabic words that she produces (46 percent). The
disyllabic geminate pattern is prominent as well (40 percent of utterances), with
adaptations such as /baba/ [mbb h] 'daddy'; /kle/ [ke] 'eat!'; /hajda nz/
[ dnnz] 'that's (a) nose'. Perhaps due to Rama's jargon practice and older age at
this stage, her consonant inventory is more varied than Martin's at the 4wp
(Appendix), with a small number of fricatives and laterals alongside stops and
nasals as well as final consonants and occasional two-word utterances (e.g., /jalla
kle/ [jake] 'come on eat'; /ptn mni/ [pd md ] 'put on Minnie').

Table 14.6. Rama's selected and adapted forms over the one-word stage;
shaded grey is used for imitations
In the next two sessions (age 1;7 and 1;8) the two patterns identified at the
4wp still make up the majority of utterances, though the prominence of the
disyllabic geminate pattern is due more to frequency of use (48 percent of
utterances) than to type (28 percent of the total of different words).
Monosyllabic CV(:)(C) is the most varied and productive shape, showing a
final glide pattern (e.g., /hajj/ [ j] 'this'; /ba/ [bjj] 'bye'; /waw/ [b ww]
'wow') and a front mid-high to mid-low vowel pattern (e.g., French danse /d s/
[teh s]; merci /msi/ [h s]; /mijaw/ [n m ] 'miaow').
In the final session the monosyllabic CV(:)(C) shape becomes the most
prominent, accounting for 40 percent of all utterances. Within this shape a subset of
productions still have the final glide pattern, as in previous sessions (Table 14.6),
but others include other consonants as well and a rich variety of vowels (e.g., /ba/
[b] 'all gone'; /mbu/ [mbu:u] 'water'; English Po (name of TV character) /po/
[po]). The influence of words of English origin is obvious in the frequency of
monosyllabic words in Rama's sessions, with words like Po, bye, wow, ball, eyes,
book, and nose making up a large proportion of her productions, especially in the last
session. The second most frequent pattern in this session is a disyllabic C(:)V(:)CV
shape (29 percent of utterances), which takes over from the medial long C(:) as the
second most frequent shape (e.g., /teta/ [teta] 'grandma'; /mama/ [mma:ma h]
'mother'; English baby /bebi/ [bebi]). These and all but one of the monosyllabic
words are selected and, apart from expected developmental features, they are
fairly accurate. In fact, most of Rama's productions in the final session are
essentially accurate; in comparison with Martin, she produces fewer utterances
and fewer repetitions of words (98 types and 203 tokens over five sessions for
Rama, compared with 179 types and 604 tokens over seven sessions for Martin)
but the words tend to be more accurate and her production exhibits no large-scale
adaptations to any preferred shape. The only pattern that still shows more
adaptation than selection is the disyllabic long/geminate pattern (e.g., /tiktak/ [ti:ttih]
'sweet'; /ba/ [ba] 'all gone'; English oven gloves /ovn lvz/ [au]),
though the frequency of occurrence of this pattern is now down to 20 percent.
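The remark about Rama producing fewer repetitions of words than Martin can be made concrete as a tokens-per-type ratio, using the type and token totals quoted in the paragraph above. The helper below is a hypothetical illustration, not a measure reported in the study.

```python
def tokens_per_type(types: int, tokens: int) -> float:
    """Average number of tokens produced per word type."""
    return tokens / types


# Type and token totals as quoted in the text for the two children.
rama = tokens_per_type(types=98, tokens=203)
martin = tokens_per_type(types=179, tokens=604)

print(f"Rama:   {rama:.2f} tokens per type")
print(f"Martin: {martin:.2f} tokens per type")
```

On these figures Rama averages about two tokens per word type and Martin a little over three, which is consistent with the description of Martin as repeating words more often.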
5.4.3 Lina
Lina was exposed to both French and English on a regular basis,
and her production reflects that, with Arabic productions constituting only about
half of her overall utterances at 48 percent, followed by French (28 percent) and
English (21 percent). Her first three sessions, between 1;3 and 1;5, had similar
patterns and no increase in the number of spontaneous words produced, so they are
combined here for analysis. In these sessions, Lina's utterances can be grouped
into the three word shapes identified for the other two children, with the disyllabic
pattern with a long medial C: being the most frequent (e.g., French Oui Oui (PN)
/wiwi/ [wwih]; maman /mam / [m mm ]; English thank you /ak ju/
[ tt]). Interestingly, these early words are all disyllabic French or English
targets with lengthened medial consonants. Lina also produces disyllabic words
with short medial consonants (e.g., /alam/ [h] 'flag'; /alo/ [v] 'hello')
and monosyllabic shapes which consist of either monosyllabic targets (mostly
English and French numbers and letters of the alphabet [Table 14.7]) or reductions
of disyllabic words (e.g., /sabat/ [bt] 'it stayed still'; /ati/ [ts] 'give'). In
terms of her consonant profile, over half of Lina's words have selected or adapted
initial glottals, while in medial position she targets and produces a collection of
bilabial and alveolar sounds; these include /l/, which is advanced relative to her
age but which might relate to her own name having an /l/ in it (see data from
Laurent in Vihman 1993). Other relatively advanced sounds that she produces
include dental, labiodental, and alveolar fricatives (see Appendix).
Over the next two sessions Lina's production of monosyllabic CVC words
increases, mostly due to her engaging in various games around counting and
reciting the letters of the alphabet in French and English with her mum.
Her disyllabic CVC:V(C) pattern is less prominent during those sessions, but
interestingly it is the only pattern which shows active adaptations on Lina's part in
comparison with the mostly selected words from the monosyllabic and disyllabic
shapes with medial short C (e.g., /ati/ [tti] 'give'; French trois /twa/
[ jjeh]; chocolat /okola/ [koll ]). This pattern becomes stronger and more
prominent over the next two sessions until it makes up 79 percent of all of Lina's
productions at age 1;9, the session immediately prior to her 25wp. Lina is actually
very close to the 25wp at age 1;9, as in the last session she produces 41
spontaneous different words and has therefore moved beyond the first 50
words (she also produces many words in general during the last session, four
times as many as in the previous session). Words like 'fish', which previously had
monosyllabic realizations, now acquire disyllabic forms (1;6 [ps] but 1;9
[ps s ]). Lina also starts applying the lengthening pattern to multisyllabic targets
(e.g., /televizj/ [deitte h] 'telly'; /bskote/ [kko t h] 'biscuit'),
multisyllabic realizations of disyllabic targets (e.g., /doa/ [wwl l d ] 'Dora'; /lala/
[ ell lh] 'Lala'), or across word boundaries (e.g., /la ma badde/
[alla ddi h] 'no, I don't want to'; /waa batata/ [w t t t h] 'he
dropped potato'). Therefore, despite the frequency of French and, to a lesser
extent, English words in Lina's vocabulary, her disyllabic pattern with a medial
geminate has become as strong as Martin's by the 25wp. This may be due to
the lengthening of many medial consonants in French words by adults in the
community. Of the monosyllabic words that Lina produces in that last session a
large proportion (55 percent) now have a long or strong/heavy first consonant
(e.g., /iddo/ [dd ] 'grandpa'; /da/ [d ] 'nice'; French si /si/ [ssih]), perhaps
showing influence from the disyllabic geminate pattern. The same applies to more
than half of the disyllabic shapes with short medial consonants (e.g., /nimo/
[mimh] 'Nemo'; /lalo/ [lle] 'nickname for Elias'). The remaining
monosyllabic and disyllabic productions are mostly selected and largely accurate.
6 Summary and discussion
This study looked at early production patterns in five Lebanese-speaking
children between the beginning and end of the one-word stage. The aim was
to provide new data on early word shapes in Lebanese Arabic and to look for
patterns in the children's production which may be indicative of the templatic
behavior reported in other languages. In terms of word shapes, we predicted
that children's early words would show the influence of the frequent disyllables
with medial gemination that are common in Arabic and that medial and final
consonants would be acquired early, leading to the early emergence of relatively
complex syllable structures. In terms of templatic behavior, we predicted that
disyllabic shapes with medial long consonants would dominate children's
preferred patterns and lead to adaptations of other word shapes to the CVC:V(C)
shape; moreover, given that template patterns are influenced by language
exposure and the child's individual experience with early words, we predicted
that individual differences and the children's varying exposure to English and
French would also play a role in how early these patterns would appear and how
systematic their productions would be. The findings support our predictions and
highlight the special role of phonological length in Arabic in the child's
acquisition of lengthening as a suprasegmental feature and the children's
tendency to overgeneralize this feature before achieving target-like production.
Below we revisit some of these findings and discuss their implications for the
relationship between accuracy and phonological advance.

Table 14.7. Lina's selected and adapted forms over the one-word stage;
shading is used for imitations
6.1 The prevalence of disyllabic structures from an early age
The data presented here show that the rich and minimally bimoraic word shapes
of the Arabic language (Broselow 1992; McCarthy and Prince 1986, 1990b;
Watson 2002) are exhibited in Arabic-speaking children's early word production.
Furthermore, the difference in the distribution of word shapes from the three
languages targeted by the children (Figure 14.1) provides an insight into how the
prosodic shapes of early words vary across languages. While the Arabic and
French words targeted were mostly disyllabic, the majority of English words
targeted were monosyllabic with codas. As a group the children produced Arabic
the most, followed by English and then French. Disyllables were therefore
targeted the most, and monosyllabic and multisyllabic words were often adapted
to the disyllabic shapes. The children also frequently produced a filler syllable at
the beginning of the word (Peters 2001), which increased the percept of multi-
syllabic production. The use of initial filler vowels or syllables by children as a
speech initiation strategy is not uncommon (see, for instance, Si's data in Macken
1979). In this study, the most common filler used by the children was a CV
syllable consisting of a glottal stop followed by a neutral vowel, but there were
other CV shapes as well; our impression is that children often used these as a
springboard for word production, as if to initiate articulation. Another possibility
is that the children were producing dummy syllables based on the frequent
occurrence of the definite article /al/, which assimilates to coronal onset con-
sonants in following nouns (e.g., /al/ + /ʃams/ is realized as [aʃʃams] 'the sun').
The children produce a wide variety of syllable structures from an early age,
including syllables with final codas. Final consonant deletion, which is common
in the production of children acquiring English and Spanish (Macken 1979),
was not found to be frequent in the production of the Arabic-speaking children
in this study. In fact, these children were more likely to add final codas to words
which would otherwise end in vowels than to delete them. These results there-
fore agree with other acquisition studies which have suggested that Arabic-
speaking children acquire a range of complex syllable shapes from an early age
(e.g., Abdoh 2011; Ammar 2002).
6.2 The role of gemination in phonological advance
As a group, the children both target and produce more disyllables with gemi-
nate/long consonants than any other word shapes. Further work on the fre-
quency distributions of word shapes in the adult language is needed, but the
sparse literature on Arabic phonology suggests that CVC:V(C) is a frequent
and productive pattern in the language, being used in both nouns and form II
verbs (Watson 2002). A large part of the CVC:V(C) realizations were also
adaptations of a CV:CV(C) target, with the children shifting length from the
preceding vowel to the medial consonant (e.g., /baːba/ realized as [babbah]
'daddy'). Lengthening was often applied to more than one segment in a word
and was also variable. As Macken (1979: 29) points out, when words are treated
as prosodic units the child may freely swap features within the unit, the feature
swapped here being segment length. Initial consonants were also occa-
sionally preceded by filler syllables, which turned the original initial consonant
into a medial one that was then lengthened (e.g., French dodo /dodo/ 'night night'
realized as [ddddh]).
While the prosodic CVC:V(C) shape was a consistent target that the children
aimed for or adapted words to, their production of the segmental material in
each word was quite variable, as evidenced by analysis of several repetitions of
the same word. On the whole, initial consonants varied more than medial ones
and were more often reduced, but interestingly there was hardly any case of the
initial consonant deletion that is often reported for languages with medial
geminates, where the geminate position diverts the child's attention to the
medial consonant (Bhaya Nair 1991; Savinainen-Makkonen 2007; Vihman
and Velleman 2000; Vihman and Vihman 2011). So while the children's higher
accuracy for medial consonants and codas chimes in with findings on other
Arabic dialects (e.g., Amayreh and Dyson 1998; Dyson and Amayreh 2000;
Shahin 2003) and other languages (e.g., Bhaya Nair 1991; Szreder this volume),
the importance of onsets in the phonological structure of Arabic words may
have played a role in the maintenance of onset consonants by the children, even
if their realization was more variable.
Although the children were on the whole more accurate in their segmental
productions towards the end of the 25wp, their realization of phonological length
became less accurate as they adapted more words to the geminate template (e.g.,
Figure 14.9, Table 14.4). This coincided with their vocabulary showing a quan-
tum leap in terms of word types and/or tokens (Table 14.2). Apart from the
children becoming more systematic in their production of the CVC:V(C) pattern,
two factors contributed to the increased production of medial long consonants: (a)
targeting of medial codas, which were often assimilated to the following con-
sonant (e.g., /xamse/ 'five' realized as [ szzh]) and (b) the emergence or, for some
children, increase in the production of multisyllabic words, in which one or more
medial consonants were lengthened in the same way as in disyllables. We hypothe-
size that this U-shaped curve or decrease in accuracy, which is often reported in
other studies, is the children's way of using a well-practiced and articulatorily
accessible production routine, the CVC:V(C) shape, to aid them in aiming for and
learning new and longer or more challenging words. The result was less accuracy
in achieving target phonological length in the later recordings due to overuse of
the medial long consonant pattern, even as children's phonetic and phonological
inventories were starting to look more adultlike.
As Savinainen-Makkonen (2007) observes for Finnish children, we think that
Lebanese-speaking children use the CVC:V(C) prosodic shape as an anchor to
practice new words and adapt them if their target form does not fit that shape. The
outcome may not resemble the patterns found in the adult phonology, but it is a
sign of the children being actively involved in doing phonology; this is evident
both in the way that the children select groups of words that match the phono-
logical structures that they hear in the input and that they are able to produce, and
the way they adapt other words to fit the prosodic shape that they are familiar with
producing. This comes at a time when the children have more articulatory control
and a richer consonant inventory, and therefore have less maturation-related reason
to lengthen target singleton consonants. Given that two of the three children whose
individual data we looked at here show regression in accuracy in terms of the
realization of phonological length towards the end of the 25wp, this calls into
question whether their earlier sessions with target-like length show true acquis-
ition of the singleton–geminate contrast. We suspect that the early accuracy might
reflect an item-learning phase when the link between singleton and geminate
consonants has not yet been acquired. In the later recordings, the children's
overuse of long durations suggests their growing attention to this salient phonetic
and phonological characteristic of Arabic and their application of length as an
active process in the production and learning of new words. We predict that the
return to accuracy in the realization of consonant length, which is to be expected in
the third year of life once templatic behavior has receded and the children's
productions are more adultlike, will represent real acquisition of gemination.
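The distinction drawn above between selected and adapted words can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the coding procedure used in this study: it takes skeletons already written in the syllable-structure notation of this chapter, treats a word as "selected" when both target and production fit CVC:V(C), and as "adapted" when only the production fits. Segmental accuracy is ignored, and the names TEMPLATE, fits_template, and classify are hypothetical.

```python
import re

# Assumption: skeletons use the chapter's notation, with "C:" for a long
# (geminate) consonant; the final C of the template is optional.
TEMPLATE = re.compile(r"CVC:V(C)?")


def fits_template(skeleton: str) -> bool:
    """Return True if the whole skeleton has the CVC:V(C) shape."""
    return TEMPLATE.fullmatch(skeleton) is not None


def classify(target_skeleton: str, produced_skeleton: str) -> str:
    """Label one target/production pair relative to the template.

    'selected': target and production both fit the template.
    'adapted' : only the production fits (the target was reshaped).
    'other'   : the production does not fit the template.
    """
    if not fits_template(produced_skeleton):
        return "other"
    if fits_template(target_skeleton):
        return "selected"
    return "adapted"


if __name__ == "__main__":
    # A CV:CV target produced as CVC:VC counts as adapted.
    print(classify("CV:CV", "CVC:VC"))
    print(classify("CVC:V", "CVC:V"))
    print(classify("CVC", "CVC"))
```

A fuller treatment would also compare the segments themselves, since a production can fit the template and still differ from the target in its consonants.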
6.3 Individual differences
As part of normal variation of language and linguistic use within Beirut, the data
reported here show varying use of words from English and French across the
five children, and this contributed to the individual differences that were evident
in both their segmental development and their early word shapes (though only
the latter was dealt with in detail in this chapter); it also made a difference to
whether or not the children showed any systematic patterns in the early record-
ings. For instance, one reason Martin appeared to be the most systematic from
the start is that he targeted a higher proportion of Arabic words than any of
the other children, and many of these were disyllabic. This, together with his
frequent use of consonant harmony and over-reliance on the medial geminate
pattern, made his productions look very systematic and template-like from the
earliest recordings, when typically there are not enough productions for any
patterns to stand out. The prevalence of consonant harmony and medial length-
ening in the later recordings, at an age when his consonant inventory was
expanding and he was beginning to produce multisyllabic words, cemented
the conclusion that the C₁VC:₁V pattern is a preferred shape for Martin rather
than the consequence of articulatory constraints and/or a small consonant
inventory. This early systematicity was not found for the other children and
confirms findings elsewhere that not all children apply consonant harmony as an
active phonological process (Macken 1979; Vihman 1978).
Rama, on the other hand, targeted more English words from the start than
Martin or Lina and, as a result, produced many more monosyllabic structures
than the other children. This, added to the fact that her productions were
generally accurate and that she did not reach the 4wp till relatively late, made
it more challenging to capture a systematic stage for her before her productions
became target-like. Within her monosyllabic productions, a weak pattern for
final glides could be pinpointed in the later sessions, similar to what has been
reported for English children in final position in monosyllables and in medial
position in disyllables (e.g., Priestly 1977; Vihman et al. 1994). The monosyl-
labic shape remains the preferred one for Rama and constitutes 40 percent of her
productions at the 25wp, but what is interesting is that it consists more of
selected than adapted words. Her disyllabic medial long consonant pattern, on
the other hand, is in decline in the last session (constituting 20 percent of her
productions) but shows more adaptation than selection, suggesting that even for
a child like Rama, who produces more monosyllables, templatic behavior is
evident in her disyllabic productions.
Lina's profile too can be partly linked to her language exposure/use, with
around half of her productions consisting of Arabic utterances, followed by
French and then English. With French having frequent disyllables like Arabic,
Lina's productions were, as expected, mostly disyllabic; but while the disyllabic
shape with a medial long consonant emerged as the most frequent pattern in the
early recordings, it was not as systematic as what was found in Martin's data,
and Lina still produced many monosyllabic and disyllabic words with singleton
consonants. Out of the data sets presented in detail here, Lina's longitudinal data
provide the best example of a decrease in accuracy as a result of the application
of a templatic pattern in the later sessions. Following the early sessions in which
the disyllabic CVC:V(C) pattern was frequent in Lina's productions, her middle
sessions were more accurate and more diverse in terms of the word shapes
produced, with many fairly accurate monosyllabic productions. Towards the
25wp, however, the CVC:V(C) pattern again became more prominent, with
many adaptations and a decrease in accuracy (for example, the decrease in
accuracy in the production of 'fish'), just as her vocabulary was rapidly expand-
ing. We see a qualitative difference between the apparent systematicity of the
early sessions, where Lina's lexicon is still small, and the later more active
application of the medial geminate pattern, at a time when articulatory control is
more advanced. More research is needed to look at individual differences in
preferred word shapes and how their patterns evolve over time within a group of
children with comparable language exposure.
7 Conclusion
This study is the first detailed investigation of Lebanese Arabic children's early
word patterns, with a focus on the transition that the child makes from the item-
based production of the first few words towards more generalized learning and
phonological systematicity. This is achieved both in the way children gradually
move towards adultlike word shapes and segmental productions and in the way
they form their own generalizations about word shapes and apply these to new
incoming words so that, for a short time, their accuracy may decrease. The
children in this study all produced many disyllabic word shapes with medial
long consonants due to their frequency in the adult input. However, their indi-
vidual preference for this pattern varied across sessions and between children,
depending on the frequency with which they heard and produced other languages
and on their individual preferences. Differences were also present in their seg-
mental inventories and the degree to which they applied early developmental
patterns such as consonant harmony. Despite the prevalence of onsets in the
children's productions, syllables with heavy rhymes or codas were produced from
an early age, and the children were more accurate in their production of consonants
in medial than in initial position. These data therefore add to the growing number
of studies on languages with quantitative contrasts that challenge the universal
attention to initial consonants that is sometimes implied. Medial gemination was
used by the children as an active process that enabled them to select words with a
familiar rhythmic shape and to adapt other words to that shape. In the later stages
of development, this was extended to multisyllabic word production.
Gemination has not received much attention in the literature on Arabic
acquisition despite its high functional load and the discrepancy between the
phonetic and phonological challenge involved in its acquisition. This study
therefore constitutes a first step toward offering a detailed account of the
acquisition of gemination in Lebanese Arabic. Current work is looking at the
acoustic indices for gemination both in adult and child production in order to
better understand the process by which children acquire the singleton–geminate
contrast; data from later sessions are also being analyzed in order to explore the
influence of morphosyntax on the acquisition of this contrast.
note
1. Geminates are transcribed as double consonants in the IPA transcriptions
throughout, but as C: in syllable structure notation in order to separate them from
consonant clusters, which are denoted as CC. Long vowels are denoted as V:
and diphthongs as VV.
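The notation described in this note can be illustrated with a short sketch. It is a toy under stated assumptions rather than the transcription tooling used for the study: the input is a plain ASCII stand-in for an IPA transcription, a doubled consonant counts as a geminate, a colon after a vowel marks length, and the vowel set and the function name cv_skeleton are hypothetical.

```python
# Assumption: a toy ASCII vowel set, not the IPA inventory of the chapter.
VOWELS = set("aeiou")


def cv_skeleton(word: str) -> str:
    """Map a toy transcription to syllable-structure notation.

    Doubled consonant (geminate) -> 'C:'; any other consonant -> 'C'
    (so a cluster of two different consonants comes out as 'CC').
    Vowel followed by ':' -> 'V:'; any other vowel -> 'V'
    (so a diphthong written as two vowels comes out as 'VV').
    """
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch == ":":
            # A length mark is only meaningful directly after a vowel.
            raise ValueError("unexpected ':' at position %d" % i)
        if ch in VOWELS:
            if nxt == ":":
                out.append("V:")
                i += 2
            else:
                out.append("V")
                i += 1
        elif nxt == ch:
            out.append("C:")
            i += 2
        else:
            out.append("C")
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for form in ["ba:ba", "babbah", "xamse", "bait"]:
        print(form, cv_skeleton(form))
```

With these conventions a long-vowel target such as "ba:ba" maps to CV:CV, while a production such as "babbah" maps to CVC:VC, which is the kind of contrast the skeleton notation is meant to expose.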
References
Abdoh, E. (2011). A study of the phonological structure and representation of first words
in Arabic. Unpublished PhD dissertation, University of Leicester.
Al-Tamimi, J. and Khattab, G. (2011). Multiple cues for the singleton–geminate contrast
in Lebanese Arabic: acoustic investigation of stops and fricatives. In W. S. Lee and
E. Zee (eds.), Proceedings of the 17th International Congress of Phonetic Sciences,
Hong Kong, August 17–21, 2011, pp. 212–15.
(2012). Acoustic cue weighting in the singleton vs geminate contrast in Lebanese
Arabic: the case of fricative consonants. Unpublished MS.
Amayreh, M. and Dyson, A. (1998). The acquisition of Arabic consonants. Journal of
Speech, Language and Hearing Research, 41, 642–53.
Ammar, W. (1999). The acquisition of consonant clusters in Egyptian children from two
to four years. Language Sciences, 2(3), 1037.
(2002). Acquisition of syllabic structure in Egyptian Colloquial Arabic. In F. Windsor,
L. M. Kelly, and N. Hewlett (eds.), Investigations in clinical phonetics and linguis-
tics, pp. 153–60. Mahwah, NJ, and London: Lawrence Erlbaum.
Ammar, W. and Morsi, R. (2006). Phonological development and disorders: Colloquial
Egyptian Arabic. In Z. Hua and B. Dodd (eds.), Phonological development and
disorders in children, pp. 204–32. Clevedon: Multilingual Matters.
Bhaya Nair, R. (1991). Monosyllabic English or disyllabic Hindi? Language acquisition
in a bilingual child. Indian Linguistics, 52, 51–90.
Boersma, P. and Weenink, D. (2009). Praat: doing phonetics by computer (version
5.1.10) [Computer program]. Retrieved from www.praat.org.
Boudelaa, S. and Marslen-Wilson, W. D. (2001). Morphological units in the Arabic
lexicon. Cognition, 81 (1), 65–92.
(2004). Abstract morphemes and lexical representation: the CV-Skeleton in Arabic.
Cognition, 92 (3), 271–303.
Broselow, E. (1992). Parametric variation in Arabic dialect phonology. In E. Broselow,
M. Eid, and J. McCarthy (eds.), Perspectives on Arabic linguistics, pp. 7–45.
Amsterdam and Philadelphia: John Benjamins.
Demuth, K. (1995). The prosodic structure of early words. In J. Morgan and K. Demuth
(eds.), From signal to syntax: bootstrapping from speech to grammar in early
acquisition, pp. 171–84. Hillsdale, NJ: Lawrence Erlbaum.
Dyson, A. and Amayreh, M. (2000). Phonological errors and sound changes in Arabic
speaking children. Clinical Linguistics & Phonetics, 14, 79–109.
Fee, J. (1995). Two strategies in the acquisition of syllable and word structure. In
E. V. Clark (ed.), Proceedings of the 27th Child Language Research Forum,
pp. 29–38. Stanford, CA: CSLI Publications.
Harris, J. and Gussmann, E. (1998). Final codas: why the west was wrong. In E. Cyran (ed.),
Structure and interpretation: studies in phonology, pp. 139–62. Lublin: Folium.
Hayes, B. (1989). Compensatory lengthening in moraic phonology. Linguistic Inquiry,
20, 253–306.
Khattab, G. (2007). Lebanese speech acquisition. In S. McLeod (ed.), The international
guide to speech acquisition, pp. 300–12. Clifton Park, NY: Thomson Delmar Learning.
Khattab, G. and Al-Tamimi, J. (2008). Durational cues for gemination in Lebanese
Arabic. Languages and Linguistics, 22, 39–56.
(2012). Geminate timing in Lebanese Arabic. Unpublished MS.
Nasr, R. T. (1960). Phonemic length in Lebanese Arabic. Phonetica, 5, 209–11.
(1966). Colloquial Arabic: an oral approach. Beirut: Librairie du Liban.
Obrecht, D. H. (1968). Effects of the second formant on the perception of velarisation
consonants in Arabic. The Hague: Mouton.
Macken, M. A. (1979). Developmental reorganization of phonology: A hierarchy of
basic units of acquisition. Lingua, 49, 11–49. Reprinted in this volume as Chapter 5.
McCarthy, J. (1982). Prosodic templates, morphemic templates, and morphemic tiers. In
H. van der Hulst and N. Smith (eds.), The structure of phonological representations,
part 1, pp. 190–223. Dordrecht: Foris.
McCarthy, J. J. and Prince, A. (1986). Prosodic morphology. MS, University of
Massachusetts, Amherst, and Brandeis University, Waltham.
(1990a). Foot and word in prosodic morphology: the Arabic broken plural. Linguistic
Enquiry, 8, 209–83.
(1990b). Prosodic morphology and templatic morphology. In M. Eid and J. McCarthy
(eds.), Perspectives on Arabic linguistics: papers from the Second Symposium, pp.
1–54. Amsterdam: John Benjamins.
Ota, M. (this volume). Lexical frequency effects on phonological development: the case
of word production in Japanese.
Peters, A. M. (2001). Filler syllables: what is their status in emerging grammar? Journal
of Child Language, 28, 229–42.
Priestly, T. M. S. (1977). One idiosyncratic strategy in the acquisition of phonology,
Ravid, D. (2002). A developmental perspective on root perception in Hebrew and Palestinian
Arabic. In J. Y Shimron (ed.), Language processing and acquisition in languages of
Semitic, root-based morphology, pp. 293–319. Amsterdam: John Benjamins.
Rose, Y. (2012). Phon (version 1.5.2). [Computer program]. Retrieved from
http://phon.ling.mun.ca/phontrac/wiki/Downloads.
Rose, Y. and Wauquier-Gravelines, S. (2007). Acquisition of speech in French. In S.
McLeod (ed.), International guide to speech acquisition, pp. 364–85. Florence, KY:
Thomson Delmar Learning.
Saleh, M., Shoeib, R., Hegazi, M., and Pakinam, A. (2007). Early phonological develop-
ment in Arabic Egyptian children: 12–30 months. Folia Phoniatrica et
Logopaedica, 59, 234–40.
Salem, H. (2000). Study of the acquisition of the syllable structure in sentence perspec-
tive in the speech of normal Egyptian children. Unpublished PhD dissertation,
University of Alexandria.
Shahin, K. (1995). Child language evidence on Palestinian Arabic phonology. In
E. V. Clark (ed.), Proceedings of the 26th Child Language Research Forum,
pp. 1046. Stanford, CA: CSLI Publications.
(2003). Prosody-segmentism in the acquisition of Arabic: word-final onsets and no
stress effects. Paper presented at the University of British Columbia Child
Phonology Conference, Vancouver, July 14.
Stoel-Gammon, C. (1987). Phonological skills of 2-year-olds. Language, Speech and
Hearing Services in Schools, 18, 323–9.
Stoel-Gammon, C. and Cooper, J. (1984). Patterns of early lexical and phonological
Vihman, M. M. (1978). Consonant harmony: its scope and function in child language. In
J. H. Greenberg (ed.), Universals of Human Language, vol. 2: Phonology,
pp. 281–334. Stanford University Press.
(1993). Variable paths to early word production. Journal of Phonetics, 21, 61–82.
Chapter 2.
Vihman, M. M., Velleman, S. L., and McCune, L. (1994). How abstract is child
phonology? Towards an integration of linguistic and psychological approaches. In
M. Yavas (ed.), First and second language phonology. San Diego: Singular
Vihman, M. M. and Vihman, V.-A. (2011). From first words to segments: a case study in
phonological development. In I. Arnon and E. V. Clark (eds.), Experience, variation,
and generalization: learning a first language, pp. 109–33. Amsterdam: John Benjamins.
Watson, J. (2002). The phonology and morphology of Arabic. Oxford University Press.
Wauquier, S. and Yamaguchi, N. (this volume). Templates in French.
APPENDIX: Consonant inventory for Martin, Rama, and Lina at the beginning and end of the one-word stage. Symbols printed in gray occur only in
imitated productions
15 Lexical frequency effects on phonological
development: the case of word production
in Japanese
Mitsuhiko Ota
1. Introduction
Much research on children's linguistic development has underscored the impor-
tance of the word as a unit of processing in early phonological production.
Many non-adultlike phonological patterns observed in children's word produc-
tion, such as pervasive assimilation of noncontiguous segments and imposition
of fixed phonetic sequences, are best understood as processes that treat the
whole word as an unanalyzed unit (Macken 1992; Vihman 1996; Waterson
1971). These processes not only mold the phonetic shape of early production,
but also appear to bias children's selection of target words toward those that
closely match the favored patterns (Vihman and Velleman 2000). An important
corollary of these findings is that a child's development of a particular sound or
sound pattern is deeply embedded in the context of individual words. In a
seminal paper addressing this issue, Ferguson and Farwell (1975: 437) drew
attention to the considerable amount of cross-lexical variability that exists in
children's production of comparable phonological units. For example, one of
the children in their study, T, at one point produced /b/ in ball consistently as [b],
but her production of the initial /b/ in baby fluctuated between [b] and [], and in
book, it was deleted from time to time. Such variability, as Ferguson and Farwell
argued, suggests that children are initially learning how to produce particular
words rather than the individual sounds that make up the phonological inven-
tory of the language.
The lexical dependency of phonological development can also be observed in
the timing of acquisition (Ferguson and Farwell 1975; Macken 1979). New
types of phonological patterns are often seen to emerge in the production of a
limited set of words, which then spread to other words (Berg 1995; Macken
1992; Menn and Matthei 1992), sometimes leaving lexical items with a similar
phonological profile unaffected for a long period of time (Macken 1979;
Moskowitz 1970, 1980; Johnson, Lewis, and Hogan 1997). Thus, the acquis-
ition of a target sound pattern can be lexically gradual, defying across-the-board
characterization of its exact timing.
Is such lexical variation simply noise in an otherwise regular process of
phonological acquisition or does it reflect a systematic relationship between
the sounds that are learned and the words that contain those sounds? Several
researchers have suggested that one way to answer this question is to borrow
insights from research on an analogous process in diachronic sound change
usually referred to as 'lexical diffusion', a process by which a historical change
begins in a subset of vocabulary items and then gradually spreads through the
lexicon (Ferguson and Farwell 1975; Hsieh 1972; Gierut 2001; Phillips 2006).
A key factor in diachronic lexical diffusion is word frequency. Unless it is
conditioned by non-phonetic factors such as word class, sound change usually
occurs first in the most frequent words (Wang and Chen 1977; Phillips 2006).
Indeed, a similar link between word frequency and phonological production has
been reported in some developmental studies. For example, second graders are
more accurate in their articulation of /s/ in initial clusters contained in high-
frequency words (Leonard and Ritterman 1971). Three- to seven-year-old
children with phonological delay show more generalization of treatment in
their production of a particular target segment when they are trained on high-
frequency words than on low-frequency words (Gierut, Morrisette, and
Champion 1999; Morrisette and Gierut 2002). However, it is still not clear
whether this relationship between lexical frequency and phonological acquis-
ition also applies to the word production of younger typically developing
children such that target-like sound production will tend to appear first in
more frequent words.
Before examining this question in more detail, we need to flesh out a few
issues related to lexical frequency effects. First, there is the question of what
type of lexical statistics may be relevant to phonological development and why.
Lexical frequency may have an impact on the child's phonological system
because repeated exposure to exemplars of the target word leads to a better-
specified mental representation of the phonological information in the word.
Under this interpretation, the relevant type of frequency will be that of the input;
that is, the frequency with which the child hears each lexical item in the ambient
language. Alternatively, children may become more accurate in their production
of particular words as they gain experience in articulating them. The relevant
type of frequency for this hypothesis will be that of the output, or the frequency
with which the child attempts to produce each lexical item. Although these two
types of lexical frequency are practically the same in adult language use, they
may be quite divergent in young children, and certainly so by definition in
prelinguistic infants. Our exploration here focuses on the impact of input
frequency, although this decision does not preclude a role for output frequency
(for discussion of the effects of production frequency, see for example Tyler and
Edwards 1993 and Keren-Portnoy, Vihman, DePaolis, Whitaker, and Williams
2010). The reason for examining input frequency rst is twofold. One is that
experimental evidence shows, perhaps unsurprisingly, that repeated exposure
improves children's phonological encoding of novel words (Schwartz and
Terrell 1983; Swingley 2007). The other is that it is much easier to obtain
reliable frequency statistics in child-directed speech than in child-produced
speech, which is more liable to estimation errors due to timing sensitivity
(i.e., children's productive vocabulary is in constant flux) and sampling sparsity
(i.e., young children produce much less in a short recording session).
The second issue pertinent to the effects of lexical frequency is the role of
phonological elements and structures. If lexical diffusion in phonological
development is genuinely a matter of lexical familiarization, children should
become better at producing more frequent words regardless of the phonolog-
ical makeup of the words. However, high lexical frequency does not guarantee
early mastery of every aspect of the phonology of a word. For example,
English has several very frequent words that contain the segment /ð/, such
as this, that, and there, but the acquisition of /ð/ is consistently late even in
these words (Hodson and Paden 1981). Similarly, the cluster stop-/r/ (i.e., /pr/,
/br/, /tr/, /dr/, /kr/, /gr/) is acquired later than other types of word-initial
consonant clusters when frequency of words is held constant (Ota and
Green 2013). The implication is that the impact of lexical frequency on
phonological development may vary in degree across different sounds or
sound patterns.
This brings us to a third issue. The lexical frequency effects discussed above
must be distinguished from the frequency effects of phonological elements and
structures. It has been reported that the accuracy or acquisition order of different
sounds and sound patterns is related to their relative frequency in the input. For
example, the proportion of different consonantal sounds found in the babbling
of English-, French-, Japanese-, and Swedish-exposed children reflects cross-
linguistic differences in the relative frequency of consonants in the ambient
adult languages (Boysson-Bardies and Vihman 1991). The order in which
syllable-coda segments or consonant clusters are produced accurately in
English-learning children is closely related to the input frequency order (or
phonotactic probability) of those structures (Beckman and Edwards 2000;
Edwards, Beckman, and Munson 2004; Munson 2001; Zamuner, Gerken, and
Hammond 2004, 2005). A similar structure-frequency relationship is observed
between the order of CV structure acquisition (CV > CVC > V/VC) and the
structures' input frequency in Dutch (Levelt, Schiller, and Levelt 1999/2000).
These findings suggest that more frequently encountered sounds or sound
patterns (say, for example, coda /t/ in English) are acquired earlier than less
frequently encountered ones (e.g., coda /b/), but they do not necessarily imply
that a sound or sound pattern is acquired earlier in more frequently encountered
lexical items (e.g., /t/ in cat) than in less frequently encountered ones (e.g., /t/ in
hat). Although these two types of input frequency (the frequency of lexical
items and the frequency of phonological targets) are closely related to each
other, they need to be considered separately in the context of phonological
development. The focus here is the token frequency of lexical items (i.e., how
many times a child hears a particular word), not the token or type frequency of
phonological targets (i.e., how many times a child hears a particular sound or
sound pattern, or how many different words a child hears containing a particular
sound or sound pattern).
Lexical frequency effects on phonological development 417
The purpose of this chapter is to present some evidence that the development
of phonological production involves both lexical diffusion and phonological
conditioning. The point at which children become able to produce a given
phonological form depends on the particular word that contains it or, more
specifically, on how often they hear that word. But the magnitude of this effect
may differ across the phonological forms under development. The particular
case that will be examined is the production of words with more than one
syllable in Japanese. Across languages, children around the age of 1 to 3 years
tend to omit syllables from their production of multisyllabic target words
(Macken 1979; Kehoe 1999/2000; Fikkert 1994; Pye 1992; although see
Vihman 1991; Lleó and Demuth 1999; Savinainen-Makkonen 2000 for cross-linguistic
differences in the degree to which this applies). Such syllable omission
or truncation occurs more frequently when target words are long or of a
particular shape. In English, for example, trisyllabic words are generally more
prone to syllable omission than disyllabic words, and, among disyllabic words,
those with final stress (e.g., giraffe) are more likely to undergo truncation than
those with initial stress (e.g., rabbit) (Allen and Hawkins 1980; Echols and
Newport 1992; Holmes 1927; Salidis and Johnson 1997). Similarly, in
Japanese, truncation in children's word production is observed more in trisyllabic
than disyllabic targets and, among disyllabic words, (C)V.CVC or
(C)V.CVV words are truncated more often (e.g., /toke/ → [ke] 'clock') than
(C)VC.CV or (C)VV.CV words (e.g., /panda/ → [pada] 'panda') (Ota 2003). If
lexical variation in phonological development is conditioned by input fre-
quency and phonological forms, we expect truncation of individual words to
decrease as a function of the input lexical frequency as long as we compare
words of the same size and shape. At the same time, the different truncation
rates across patterns may not be reducible to frequency effects of individual
lexical items. That is, we do not expect to find a monolithic correlation between
truncation rate and input frequency of all lexical items (regardless of their
phonological structures). These predictions are examined in the spontaneous
word production of Japanese-learning children between the ages of 1 and 2.
2. Data
The quantitative analysis presented in this chapter is largely reproduced from
Ota (2006), which is based on Miyata's (1992, 1995, 2000) spontaneous speech
corpus of three children acquiring Japanese: Aki, Ryo, and Tai. Additional
examples presented in Appendix A are drawn from three other children, studied
in Ota (2003): Hiromi, Kenta, and Takeru. Both corpora are accessible from the
CHILDES database (MacWhinney 2000). Recordings for the Miyata corpus
began when the child was either 1;4 (Ryo) or 1;5 (Aki and Tai), and were
carried out weekly until the child reached 3;0 (Aki and Ryo) or 3;1 (Tai), except
for Aki's first six sessions, which were held monthly. The children's utterances
were transcribed in a phonemic system (JCHAT) in combination with a broad
418 Mitsuhiko Ota
phonetic system (UNIBET), particularly for productions with noticeable devia-
tion from the adult target forms. In the examples below as well as in the
Appendices, these were all converted to IPA notations.
In examining the effects of the phonological makeup of the target words,
focus was placed on the global size and shape of the word, such as the number of
syllables and the prosodic features of the syllables. For this reason, target words
including specific segments that are independently known to be frequently
omitted in children's production were excluded from the quantitative analysis.
These include devoiced vowels (e.g., /i koki / 'airplane') and the flap /ɾ/
between two identical or similar vowels (e.g., /kma/ 'car'). Another
group of items excluded were onomatopoeic expressions, as it was difficult to
establish the number of syllables in their intended targets (e.g., [ba ba] 'bang
bang'; target /ba ba ba/ or /ba ba/?).¹
The remaining target words were coded for number of syllables, syllable
weight, and pitch accent. The classification of syllable weight was based on
phonological and morphological evidence from adult Japanese. Syllables containing
a long vowel (e.g., /zo/), a diphthong (e.g., /kai/), or a coda consonant
(e.g., /pa/), including the first half of a geminate (e.g., the first syllable in
/mot.to/), were classified as heavy (or bimoraic). All other syllables, that is, ones
that have a short vowel but not a coda (e.g., /te/), were considered light (or
monomoraic). Japanese-speaking children show sensitivity to this distinction in
their early word production, and they tend to retain the weight of syllables even
when segments are deleted or shortened (e.g., /pegi/ → [pip.pi] 'penguin',
/keki/ → [kik.ki] 'cake') (Ota 2003, 2006). The target words were also coded
for the presence and location of pitch accent. In all subsequent examples in this
chapter (including those in the Appendix), the accented syllable is marked by an
acute accent diacritic placed above the vowel. A pitch accent in Japanese is the
location in a word where a high to low pitch movement occurs. As its name
suggests, the main phonetic manifestation of a pitch accent is pitch, and unlike
lexical stress in languages such as English, it does not usually affect the duration
or the vowel quality of the accented syllable. Another characteristic of pitch
accent in Japanese is that any lexical item has either one accent, which is
assigned to a specific syllable, or no accent at all. Unaccented words can carry
pitch movements of the phrase or utterance they are in, but lack any lexically
specific contours. Pitch accent plays a role in early word production such that
an accented syllable tends to resist omission (e.g., /obta/ → [ba] 'grandma',
/pama/ → [pama] 'pajamas') (Ota 2003, 2006).
Although the Ota (2006) study excluded target words that were attempted
fewer than five times within each time period, no such restriction was imposed
on the analysis here. This meant that lexical items with only a few attempts were
included in the analysis, but the overall results do not differ from the original
study except for a slight increase in statistical power. Interested readers are
referred to the method and results section of Ota (2006).
3. Analysis
To get a sense of how often Japanese-learning children omit a syllable in
producing words, let us first look at the overall rate at which such truncation
occurs before age two. Figure 15.1 confirms what has already been robustly
observed: Initially, children do not target many words with more than two
syllables, and when they do, they tend to truncate them in production.
Some examples of syllable omission in target words with three or more
syllables are given in Table 15.1. In order to adjust for the different rates of
lexical production among the children, the examples are taken from the first
month in which the child produced at least 15 words spontaneously in a half-hour
recording session (the '15-word point'). A full list of words produced by five of
the children at the 15-word point is presented in Appendix A.

[Figure 15.1: panels for Aki, Ryo, and Tai; x-axis: Age (1;05–1;07, 1;08–1;10, 1;11–2;01); y-axis: Truncation (%), 0–100; one series per number of syllables (2, 3, 4+)]
Figure 15.1. Percentage of truncation (omission of syllables) in word
production by Aki, Ryo, and Tai, organized by age (year; month) and the
number of syllables in the target word (2, 3, and 4 or more). Missing values
indicate no relevant target words.
Figure 15.1 also shows that the rate of truncation decreases over time, but it is
always higher for longer words. It is possible that this effect reflects the length
distribution of the words children hear in the input, as illustrated in Figure 15.2.
For words of two or more syllables, the relative frequency (both token and type)
of words in maternal speech decreases rapidly as a function of the length of the
words.
This type of correlation between production accuracy and input frequency
can be taken as evidence that children's ability to produce words with a certain
type of phonological structure (e.g., trisyllabic) is related to the amount of
exposure they have to the structure. But it does not demonstrate that their ability
to produce individual lexical items is related to exposure to those specific
words. To examine the latter, we need to test whether the production accuracy
(or truncation rate, in this case) of individual words is correlated with their input
frequency. However, an overall analysis of data from the three children in the
Miyata corpus reveals no significant correlations between the truncation rates of
target words and their lexical frequencies in maternal speech.
A closer look at the data reveals why this is the case. Figures 15.3 and 15.4
display the relationship between truncation rates in word production in Tai's
speech between 1;5 and 1;7 and lexical frequency in his mother's speech. In
both the disyllabic and trisyllabic data, it is evident that there are two groups of
words. In the top two-thirds of each panel are words that undergo some truncation,
words whose rate of truncation is apparently related to their input frequency.
At the bottom of each panel, however, are words with no observed cases
of syllable omission even though these words vary in their frequency as much as
the words in the first group do.
Table 15.1. Truncated productions of words with three or more syllables
   Target      Gloss        Child form            Child (age)
a. /io/        back         [tio]                 Aki (1;8)
b. /atam/      head         [apa]                 Hiromi (1;3)
c. /oki/       sweets       [k], [ga]             Takeru (1;7)
d. /tonne/     tunnel       [ninne], [nenne]      Kenta (2;0)
e. /bnana/     banana       [nana]                Ryo (1;10)
f. /iigo/      strawberry   [no], [o]             Ryo (1;10)
g. /pama/      pajamas      [pama]                Ryo (1;10)
h. /imama/     zebra        [ima], [ma], [ma]     Ryo (1;10)

To better understand the source of this difference, let us inspect the phonological
shape of the target words. Table 15.2 presents the truncation rates of
words with different target phonological structures, categorized according to the
three prosodic parameters described in the previous section: (1) number of
syllables, (2) syllable weight, and (3) the presence/position of pitch accent. The
abbreviations in the leftmost column of Table 15.2 represent all this information
in a condensed form, with L and H indicating light and heavy syllables,
respectively, and the number indicating the location of the syllable that bears the
pitch accent (0 indicates an unaccented word). To exemplify, Appendix B
presents a subset of the data on which the analysis is based, all words produced
by Tai during his session at 1;6.4. The analysis was carried out separately for
each of three periods, each spanning three months: 1;5–1;7, 1;8–1;10, and 1;11–2;1.
The table omits the first two periods for Aki and Ryo because neither child
targeted more than two prosodic word types for trisyllabic words, and Tai's last
period is omitted because the truncation rate for that period was extremely low
(1.2%). Altogether, Aki's target words fell into 42 categories, Ryo's into 33
categories, and Tai's into 38 categories. Table 15.2 shows some of the most
frequently targeted categories.
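The coding scheme just described (syllable weights followed by the position of the pitch accent) and the per-category truncation rate can be sketched as follows. The function names and the data layout are hypothetical choices made for this illustration; the counts in the usage lines are the token counts reported for Tai at 1;5–1;7 in Table 15.2.

```python
from collections import defaultdict


def prosodic_label(weights, accent):
    """Build a label such as 'LH2' from syllable weights and accent position.

    weights: sequence of 'L' (light) or 'H' (heavy), one per syllable.
    accent: 1-based position of the accented syllable, or 0 if unaccented.
    """
    if any(w not in ("L", "H") for w in weights):
        raise ValueError("weights must contain only 'L' or 'H'")
    if not 0 <= accent <= len(weights):
        raise ValueError("accent must be 0 or a valid syllable position")
    return "".join(weights) + str(accent)


def truncation_rates(attempts):
    """Summarize truncation per prosodic category.

    attempts: iterable of (label, truncated) pairs, one pair per token.
    Returns {label: (percent_truncated, truncated_count, total_count)}.
    """
    tally = defaultdict(lambda: [0, 0])
    for label, truncated in attempts:
        tally[label][0] += int(truncated)
        tally[label][1] += 1
    return {lab: (100.0 * t / n, t, n) for lab, (t, n) in tally.items()}


# Token counts for two of Tai's categories at 1;5-1;7 (from Table 15.2):
# LH0 was truncated in 21 of 27 tokens, LL2 in 0 of 30 tokens.
attempts = (
    [(prosodic_label("LH", 0), True)] * 21
    + [(prosodic_label("LH", 0), False)] * 6
    + [(prosodic_label("LL", 2), False)] * 30
)
rates = truncation_rates(attempts)
for label, (percent, truncated, total) in rates.items():
    print(f"{label}: {percent:.1f} ({truncated}/{total})")
```

Keeping the raw counts alongside the percentage mirrors the layout of Table 15.2 and makes it easy to see when a rate rests on very few tokens.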
It can be seen from this table that truncation rates vary quite dramatically
even when we look separately at disyllabic and trisyllabic targets. The difference
across prosodic structures is significant for Aki's and Ryo's disyllabic
targets (χ²(11) = 20.31, p < .05 and χ²(11) = 68.23, p < .001, respectively), and for
Tai's disyllabic (χ²(11) = 354.15, p < .001) and trisyllabic targets (χ²(20) = 44.75,
p < .01) at 1;5–1;7. The difference also approached significance in Ryo's trisyllabic
targets (χ²(20) = 29.21, p = .08).

[Figure 15.2: x-axis: Syllables (1–6); y-axis: Percentage (0–50); series: Token and Type frequency]
Figure 15.2. Token and type word frequency in the maternal speech directed
at Aki, Ryo, and Tai, organized by the number of syllables in each word
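A chi-square statistic of the kind reported here can be computed from truncated and total token counts per prosodic structure. The sketch below is a generic test of homogeneity written for illustration, not a reconstruction of the original analysis; the function name is hypothetical. Because Table 15.2 lists only selected categories, running it on the table's rows is not expected to reproduce the reported values.

```python
def chi_square_truncation(counts):
    """Chi-square test of homogeneity for truncation across categories.

    counts: {label: (truncated_tokens, total_tokens)}.
    Returns (chi_square_statistic, degrees_of_freedom), where the degrees
    of freedom equal the number of non-empty categories minus one.
    """
    counts = {k: v for k, v in counts.items() if v[1] > 0}
    if len(counts) < 2:
        raise ValueError("need at least two categories with data")
    grand_total = sum(n for _, n in counts.values())
    grand_truncated = sum(t for t, _ in counts.values())
    p_truncated = grand_truncated / grand_total
    statistic = 0.0
    for truncated, total in counts.values():
        cells = (
            (truncated, p_truncated),              # truncated tokens
            (total - truncated, 1 - p_truncated),  # intact tokens
        )
        for observed, p in cells:
            expected = total * p
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic, len(counts) - 1


# Disyllabic rows listed for Tai at 1;5-1;7 in Table 15.2: (truncated, total).
tai_disyllabic = {
    "LL0": (2, 485), "LL1": (1, 169), "LL2": (0, 30),
    "LH0": (21, 27), "LH2": (11, 145), "HL1": (7, 457),
    "HH0": (1, 153), "HH1": (1, 153), "HL2": (3, 281),
}
statistic, df = chi_square_truncation(tai_disyllabic)
print(f"chi-square({df}) = {statistic:.2f}")
```

With twelve disyllabic categories (four weight patterns by three accent options) the degrees of freedom would be 11, which matches the value reported for the disyllabic tests.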
While disyllabic targets generally have very low truncation rates, there are
some exceptions. For example, Ryo's LH2 targets were truncated over 15
percent of the time. Four of the 10 word types belonging to this category were
truncated. These are shown in Table 15.3 (a–d). In contrast, no truncation was
observed in Ryo's data for LL2 words (52 tokens from 12 word types), LH0
words (16 tokens from 4 word types) or LH1 words (21 tokens from 5 word
types). Tai's LH0 and LH2 targets at 1;5–1;7 were truncated 77.8 and 7.6
percent of the time, respectively. Of the 11 word types classified into these
structures, 5 were truncated (see Table 15.3 (e–i)). Again, this contrasts with the
extremely rare truncation in his LL0 targets (only 1 occurrence out of 480
tokens from 10 word types) or the total lack of truncation in his production of
LH1 targets (none in 30 tokens from 9 word types).
[Figure 15.3: scatter plot; x-axis: Log lexical frequency in maternal speech; y-axis: Log truncation rate]
Figure 15.3. Tai's production of disyllabic targets in 1;5–1;7. Each circle
represents a target word plotted against its log-transformed proportional
frequency in his mother's speech (X-axis; the higher the number, the higher
the frequency) and its log-transformed rate of truncation in the child's
production (Y-axis; the higher the number, the higher the rate).
Despite variation in pitch patterns, the disyllabic words that are more likely to
truncate have the shape LH across children. It is not the case that LH words have
an overall lower frequency in child-directed speech (CDS) (Ota 2006). But it
has been observed that they are conspicuously missing in CDS-specific lexical
items (i.e., the equivalents of words such as 'tummy' and 'choo-choo') in
Japanese (e.g., /nenne/ 'sleep', /ajo/ 'foot', /pompo/ 'tummy', /poppo/
'choo-choo') (Kubozono 2003). It may be that the accuracy of child production
of LH structures lags behind because these structures are underrepresented in
words that are central to children's interaction with caregivers.
We also see exceptions to the generalization that longer words are more
difficult than shorter words. There are some trisyllabic targets that appear to
cause no production problems for the children. Ryo never omitted syllables
from his LLL2 trisyllabic targets (out of 43 tokens for 8 word types including
/omta/ 'toy', /ha/ 'run', /tamgo/ 'egg'). Tai never omitted syllables from
his LLH2 trisyllabic targets (out of 54 tokens for 5 word types including
/nomnai/ 'don't drink', /patka/ 'police car', and /odky/ (name of a train
line)). It is difficult to find commonalities among this collection of structures,
but one possibility is that they are connected to the development of morphologically
complex forms. For instance, by 1;11 Ryo has acquired the nonpast verbal
ending -/ɾɯ/. When -/ɾɯ/ is attached to a CVCV verb stem, it results in LLL2
(e.g., /ha/ 'run', /tab/ 'eat'), which Ryo produces. Similarly, by 1;8
Tai has acquired -/nai/, the negative verb suffix, which forms an LLH2
structure when attached to a CVCV verb stem (e.g., /nomnai/ 'not drink',
/tabnai/ 'not eat'). It may be that such complex forms are internally analyzed or
more easily remembered because of the common ending, allowing children to
produce them more accurately than morphologically simpler trisyllabic forms.
That in turn may promote the production of monomorphemic forms of the same
phonological structure. Whatever the reasons turn out to be, it is clear that the
accurate production of words can be influenced not only by the length of the word
(measured in number of syllables) but also by the shape of the word (characterized
in terms of syllable weight and pitch accent).

[Figure 15.4: scatter plot; y-axis: Log truncation rate]
Figure 15.4. Tai's production of trisyllabic targets in 1;5–1;7. Each circle
represents a target word plotted against its log-transformed proportional
frequency in his mother's speech (X-axis) and its log-transformed rate of
truncation in the child's production (Y-axis).

Table 15.2. Truncation rates (%) for selected prosodic structures
Aki Ryo Tai
Structure 1;11–2;1 1;11–2;1 1;5–1;7 1;8–1;10
LL0 2.3 (3/131) 0.5 (1/203) 0.4 (2/485) 0.5 (5/976)
LL1 4.0 (4/101) 0.0 (0/348) 0.6 (1/169) 1.6 (4/243)
LL2 3.5 (2/58) 0.0 (0/52) 0.0 (0/30) 0.0 (0/64)
LH0 14.3 (1/7) 0.0 (0/16) 77.8 (21/27) 0.0 (0/84)
LH2 0.0 (0/12) 15.2 (12/79) 7.6 (11/145) 0.0 (0/113)
HL1 0.8 (1/124) 3.4 (7/209) 1.5 (7/457) 0.4 (2/507)
HH0 8.7 (2/23) 0.0 (0/10) 0.7 (1/153) 0.9 (1/112)
HH1 4.0 (4/99) 4.0 (4/100) 0.7 (1/153) 0.0 (0/97)
HL2 12.1 (8/66) 1.6 (2/127) 1.1 (3/281) 1.0 (4/402)
LLL1 66.7 (4/6) 18.8 (6/32) 65.4 (34/52) 11.4 (4/35)
LLL2 100.0 (1/1) 0.0 (0/43) 16.7 (2/12) 5.6 (1/18)
LLH2 100.0 (3/3) 33.3 (3/9) 0.0 (0/25) 0.0 (0/29)
LHL0 38.9 (7/18) 100.0 (1/1) 0.0 (0/35)
HLH2 100.0 (12/12) 5.3 (1/19) 0.0 (0/4) 0.0 (0/4)
HHL2 80.0 (4/5) 11.1 (6/54) 0.0 (0/1) 10.0 (4/40)
HHH1 58.3 (7/12) 0.0 (0/1) 3.2 (2/62) 3.2 (2/62)
HHH2 5.3 (2/38) 0.0 (0/34) 86.7 (13/15) 0.0 (0/11)
Note: Numbers in brackets are token counts. L stands for a light syllable, and H for a
heavy syllable. 0 denotes no accent, 1 a pitch accent on the first syllable, and 2 a
pitch accent on the second syllable.

Table 15.3. Truncated productions of LH targets
   Target     Gloss      Child form   Child (age)
a. /omi/      heavy      [moi]        Ryo (2;0)
b. /kaji/     itchy      [ji]         Ryo (2;1)
c. /iti/      hurting    [tai]        Ryo (2;0–2;1)
d. /sgi/      great      [oi]         Ryo (2;0–2;1)
e. /bdo/      grape      [b]          Tai (1;5)
f. /toke/     clock      [ke]         Tai (1;5–1;6)
g. /iti/      hurting    [tai]        Tai (1;5–1;6)
h. /taki/     high       [tai]        Tai (1;6)
i. /sgi/      great      [goi]        Tai (1;7)
Apart from the observation that some short words tend to truncate more
frequently than others and some long words tend to truncate less frequently
than others, there is additional evidence that syllable omission is induced
partly by inherent levels of difficulty in producing certain phonological
structures. For words with certain phonological shapes, there is a systematic
correspondence between the target and its truncated form. Thus, as is evident
from the examples in Table 15.3, when truncated, LH targets almost
invariably take the form H, with the initial syllable lost. Truncation of LHL2
most frequently results in HL (in which the syllables produced are not
necessarily the second and third syllables of the adult form, however), as
exemplified by the following: /tokk/ → [takk] 'truck' (Aki, 2;1.24),
/tatita/ → [taita] 'hit' (Ryo, 2;1.25), /dida/ → [da] 'car' (Tai, 1;8.28),
[da] (Kenta, 1;7.16) (see Ota 2003 and 2006 for other frequent correspondence
patterns). It is interesting to note that these mappings are reminiscent of
those found in other languages. In English and Dutch, for example, there is a
strong tendency for wS targets (i.e., disyllables with a weak/unstressed initial
syllable and a strong/stressed final syllable) to truncate to S ([rf] 'giraffe') and
wSw targets to Sw ([teto] 'potato') (Allen and Hawkins 1980; Echols and
Newport 1992; Fikkert 1994; Wijnen, Krikhaar, and den Os 1994). These
cross-linguistic similarities bolster the argument that variation in truncation
rates across words reflects, at least in part, constraints imposed by the phonological
structure of the target words.
This does not mean that lexical frequency plays no role in the accuracy of
production. When words which never undergo syllable omission are excluded
from the analysis, significant correlations are found between the lexical input
frequency and truncation rates of disyllabic targets in Aki's production between
1;11 and 2;1 (r = -.845, n = 10, p < .01), Ryo's production between 1;11
and 2;1 (r = -.886, n = 13, p < .001), and Tai's production between 1;5 and 1;7
(r = -.798, n = 18, p < .001), and 1;8 and 1;10 (r = -.917, n = 12, p < .01).
Correlations between lexical frequency and truncation rate are also found for
trisyllabic targets in Ryo's production between 1;11 and 2;1 (r = -.810, n = 15,
p < .001), and in Tai's production between 1;5 and 1;7 (r = -.852, n = 11,
p < .001), and 1;8 and 1;10 (r = -.680, n = 20, p < .001).² These correlations in
Tai's data for the period 1;5–1;7 are illustrated in Figure 15.5 (disyllabic targets)
and Figure 15.6 (trisyllabic targets), and for the period 1;8–1;10 in Figure 15.7
(disyllabic targets) and Figure 15.8 (trisyllabic targets). What these figures
show is that the overall rate of truncation decreases over time, but at any given
time during these periods, words heard more frequently in the input tend to have
lower truncation rates. In Figure 15.5, for example, we see three target words
with the disyllabic structure LH2 that Tai attempted at 1;5–1;7: /taki/ 'high',
/iti/ 'hurts', and /aki/ 'red'. The latter two are more frequent in the input and
lower in truncation rate than the first. Similarly, in Figure 15.6, the two
trisyllabic HHH2 words, /ampmma/ (name of cartoon character) and
/hanbga/ 'hamburger', stand in the same frequency–truncation relationship.
In the next stage (i.e., 1;8–1;10), however, none of these words can be seen
along the regression lines represented in Figures 15.7 and 15.8. In fact, by this
stage all LH2 and HHH2 words in the data exhibit zero truncation.
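The correlational analysis described in this section can be sketched as below: words that are never truncated are set aside, frequency and truncation rate are log-transformed, and Pearson's r is computed together with its t statistic on n − 2 degrees of freedom. The function names and the toy word records are assumptions made for illustration, and natural logarithms are used here as an arbitrary choice (the base does not affect r). A negative r indicates that more frequent words are truncated less often.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def frequency_truncation_correlation(words):
    """Correlate log input frequency with log truncation rate.

    words: list of (input_proportion, truncated_tokens, attempted_tokens).
    Words with no truncated tokens are dropped before the correlation.
    Returns (r, n, t) for the remaining words.
    """
    kept = [(freq, t / n) for freq, t, n in words if t > 0]
    xs = [math.log(freq) for freq, _ in kept]
    ys = [math.log(rate) for _, rate in kept]
    r = pearson_r(xs, ys)
    return r, len(kept), t_statistic(r, len(kept))


# Invented records: (proportion of maternal tokens, truncated, attempted).
toy_words = [
    (0.0040, 1, 20),
    (0.0020, 3, 20),
    (0.0010, 6, 20),
    (0.0005, 9, 15),
    (0.0030, 0, 40),  # never truncated, so excluded from the correlation
]
r, n, t = frequency_truncation_correlation(toy_words)
print(f"r = {r:.3f}, n = {n}, t({n - 2}) = {t:.2f}")
```

The t statistic can then be compared against a t distribution to obtain a p value; that step is left out here because it needs a statistics library or a table of critical values.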
The picture that emerges from this analysis is one in which words with a
particular phonological structure become gradually amenable to adultlike production,
beginning with the most frequent lexical items. Figures 15.5–15.8
offer snapshots of this process in disyllabic and trisyllabic words. However, this
tendency is conditioned by the phonological structure of the target word. Words
that are shorter and have certain prosodic profiles are easier to produce, and do
not always exhibit any detectable influence of input lexical frequency. The
implications of these observations will be discussed in the following section.
[Figure 15.5: scatter plot; y-axis: Log truncation rate; labeled words: tokee, totta, budoo, suika, takai, buubu, maru, ikkai, ita, ippai, itai, aka, atta, akai, chiizu, kotchi, doite, chiitaa]
Figure 15.5. Tai's production of disyllabic targets with some truncation in
1;5–1;7
4. Discussion
The purpose of the analysis in Section 3 was to test the two predictions that
follow from the idea that changes in phonological development spread from the
words that are most frequent in the input to those that are less frequent, to the
extent that words with the same phonological makeup are compared. To
measure the degree of acquisition, truncation was used as an approximate
inverse indicator of children's ability to produce a target form faithfully. The
main prediction was that there should be a negative correlation between the
truncation rates of individual words and their frequency in maternal speech.
Another prediction was that there would be systematic differences in truncation
rates across phonological structures which cannot be reduced to lexical fre-
quency differences.
Both of these predictions were supported by the spontaneous speech data
from Japanese-speaking children analyzed above. The words children attempted
to produce behaved in two different ways. Some of them, with prosodic
profiles that varied from one child to the next, showed a significant negative
correlation between their truncation and input frequencies. For these structures,
children were more accurate in producing frequent words. Words with other
structures were never truncated, irrespective of their input frequencies. The
difference between these two types of words is not due to the frequency of the
individual words in each group, for otherwise, a single correlation between
truncation and frequency would have been obtained from the data set as a
whole. Rather, these findings indicate that the developmental timeline of word
production can differ from one type of phonological structure to another, but
within each word structure type, changes affect individual lexical items in a
systematic way according to their input frequencies. Such frequency effects
disappear once a word structure becomes readily available to the learner and no
longer induces syllable omission.

[Figure 15.6: scatter plot; y-axis: Log truncation rate; labeled words: Taishookun, midori, hanbaagaa, ichigo, ootobai, banana, otete, ampamman, okaeri, tadaima, ohayoo]
Figure 15.6. Tai's production of trisyllabic targets with some truncation in
1;5–1;7
These findings highlight two important aspects of the role of the lexicon in
phonological development. First, they demonstrate that the cross-word variation
in sound production reported in previous research is systematically related
to lexical frequency. Like diachronic sound change, developmental sound
change is also sensitive to how often individual words are heard. Second, lexical
frequency effects on phonological development are conditioned by the sounds
and sound patterns that make up the word. The influence of word frequency is
not simply a matter of familiarization with the lexical items, but a complex
process that arises from the interaction between lexical learning and the nature
of the sound form in development.
[Figure 15.7: scatter plot; y-axis: Log truncation rate; labeled words: koohii, deta, doite, hako, eki, ippai, issho, yatte, kotchi, koko, atta, chiizu]
Figure 15.7. Tai's production of disyllabic targets with some truncation in
1;8–1;10
There are at least two possible ways in which these types of factors might
interact in the development of production. One is that lexical factors (such as
lexical input frequency) and phonological factors (such as the target segment or
word structure) have continuous and independent influences on word production.
Frequent words are produced more accurately than infrequent ones, and at
the same time, inherently easier sounds or sound patterns are produced more
accurately than difficult ones. As children become generally better at producing
words, lexical differences in the production accuracy of easier phonological
targets become increasingly smaller in magnitude, making the frequency effects
much less easily detectable. The second possibility is that lexical and phonological
factors interact in a more dynamic manner during development. When a
particular phonological aspect is mastered in a certain number of lexical items,
the learning may generalize to other words containing the same phonological
element or pattern such that subsequent learning is speeded up, minimizing the
impact of lexical factors such as input frequency on learning time. Such a
developmental interaction has been observed in the acquisition of syntactic
patterns (Ninio 1999; Keren-Portnoy 2006) and may also apply to phonological
acquisition. Under this interpretation, the zero-truncation words in this study
belong to categories of phonological structures that have reached the acquisition
point at which the time it takes for new words to be produced accurately is
below the analysis threshold. It is not possible to differentiate these two
scenarios based on the data in the current study because the sampling is not
sufficiently frequent to track the rate of development of individual words, and
the linguistic analysis using syllable truncation is too coarse to detect subtle
changes over time. There is some evidence from consonant cluster production
that the time it takes to reach target-like phonological production decreases for
words that enter the lexicon later (Ota and Green 2013), but it is unclear whether
the reduction of learning time is or is not accelerated once the clusters are
learned in a critical mass of lexical items. Analysis of finer-grained phonetic
data (e.g., voice onset time in voiceless vs. voiced consonants) with a better time
resolution is probably required to shed some light on this issue.

[Figure 15.8: scatter plot; y-axis: Log truncation rate; labeled words: jidoosha, toreta, donguri, kuruma, shooboosha, hakobu, shuppoppo, Taishookun, chuushajoo, oriru, hirotta, irasshai, kyuukyuusha, hayaku, moratte, asoko, ireta, nimotsu, abunai, hommono]
Figure 15.8. Tai's production of trisyllabic targets with some truncation in
1;8–1;10
A point that warrants emphasis is that this study examined just one aspect of
phonological development, and it remains to be seen if the same pattern of
interaction between lexical frequency and phonological forms applies to other
areas. As pointed out by Macken (1992), there may be developmental changes
in phonology that occur across-the-board as well as some that spread gradually
through the child's lexicon. If such a distinction is qualitative (as opposed to
being a matter of degree), a question that needs further investigation is what
characterizes the aspects of phonological development that are lexically and
gradually diffused. Even when there is lexical diffusion, the factors behind the
different timing of phonological development across words may vary from one
aspect of the sound system to another. In a recent study investigating the overall
production of segments at 2;0 and 2;5, Sosa and Stoel-Gammon (2012) found
effects of age of word acquisition (i.e., when the word entered the child's
lexicon) and phonological neighborhood density (i.e., the number of similar-sounding
words), but not of lexical frequency, on the accuracy of production. In
contrast, in examining the production accuracy of word-initial clusters between
1;0 and 3;0, Ota and Green (2013) found effects of all three factors, including
lexical input frequency. Although differences in methodology and age range
make the direct comparison of these studies difficult, they suggest that different
aspects of phonological development may be affected by distinct sets of lexical
factors.
The conditioning role of phonological form also requires further exploration.
Although the analysis presented here employed prosodic units such as syllable
weight that are abstracted away from segmental details, the phonological forms
that condition lexical diffusion may be individual word templates specified for
segmental and articulatory features (e.g., Macken 1979; Waterson 1971). One
intriguing outcome of this study is that the structures of words that did or did not
exhibit syllable omission differed from one child to the next. While this type of
variation may also be a result of differences in the structural frequencies of the
relevant phonological forms (e.g., Tai may have heard relatively more LLH2
structures in his input), cross-linguistic research on mother–child speech suggests
that such differences are more likely to be due to individual characteristics
of the learner's current linguistic state (Vihman, Kay, de Boysson-Bardies,
Durand, and Sundberg 1994).
The production data from Japanese analyzed in this chapter provides evi-
dence that phonological development that leads to the reduction of syllable
omission occurs first in the words that children are exposed to most frequently.
However, input frequency is only one of the potential factors that characterize
lexical diffusion of phonological development. Another guiding factor that
has come up in the literature is neighborhood density, the number of words in
the lexicon that are phonologically similar to the target word (Gierut 2001).
Based on findings in (adult) psycholinguistic research that words with fewer
neighbors are easier to recognize, it has been suggested, although not verified
in data with typically developing young children, that phonological develop-
ment is also more likely to originate in low-density neighborhoods (Gierut,
Morrisette, and Champion 1999). Another factor that needs further investiga-
tion is the timing of lexical learning. Although there is a general tendency for
words that are acquired earlier to be more frequent in the input (Goodman, Dale,
and Li 2008) and also more accurately produced (Garlock, Walley, and Metsala
2001), some early acquired items can be less responsive to the updating of the
phonology (Menn and Matthei 1992; Moskowitz 1980). Truncation of early-
acquired words, for example, may persist even after other similar-shaped words
cease to truncate (Moskowitz 1980; Johnson, Lewis, and Hogan 1997). If these
words are not infrequent, phonological development may be more likely to
affect recently acquired lexical items (other things being equal). Future inves-
tigations into the mechanisms responsible for the generalization of phonological
patterns across the lexicon need to address not only lexical frequency but also
such factors as neighborhood density and the timing of lexical learning.
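As a rough illustration of the neighborhood density measure mentioned above, the sketch below counts, for a target word, the lexicon entries that differ from it by exactly one segment (one substitution, insertion, or deletion). That one-segment criterion is a common operationalization assumed here, not one stated in this chapter, and the toy lexicon and function names are hypothetical.

```python
def is_neighbor(a, b):
    """Return True if b differs from a by exactly one segment
    (substitution, insertion, or deletion). Words are tuples of segments."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: deleting one segment from the longer word
    # must yield the shorter word (insertion or deletion).
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def neighborhood_density(target, lexicon):
    """Number of lexicon entries that are one-segment neighbors of target."""
    return sum(is_neighbor(target, word) for word in lexicon)


# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = [
    ("k", "a", "t"),
    ("b", "a", "t"),       # substitution neighbor of k-a-t
    ("k", "a", "p"),       # substitution neighbor of k-a-t
    ("a", "t"),            # deletion neighbor of k-a-t
    ("k", "a", "t", "s"),  # insertion neighbor of k-a-t
    ("d", "o", "g"),       # no neighbors in this lexicon
]

if __name__ == "__main__":
    for word in TOY_LEXICON:
        print("".join(word), neighborhood_density(word, TOY_LEXICON))
```

On this definition a word in a dense neighborhood has many such near-matches; other similarity criteria would give different counts.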
notes
1. The unassimilated syllable-final nasal in Japanese is transcribed as dorso-uvular [ɴ],
although the closure is often incomplete (Okada 1999; Vance 2008).
2. In both age bins for Tai, there is an outlier: Taishookun (/taiok/). This is the
child's name (Taishoo) with a suffix for boys' names (kun). Because of this transparent
morphological structure, it is possible that this word is treated as a combination
of disyllabic and monosyllabic forms. A reanalysis without this word fails to reach
significance in the 1;5–1;7 data (r = -.556, n = 10, p = .095) but shows a significant
negative correlation in the 1;8–1;10 data (r = -.615, n = 19, p < .005).
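The p values in note 2 can be checked approximately from r and n alone, assuming they come from the usual two-tailed t test of a Pearson correlation with n - 2 degrees of freedom (the note does not name the test, so this is an assumption). The sketch below, with hypothetical function names, converts r to t and integrates the t density numerically.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: 1 minus twice the area under the density on [0, |t|],
    with the area computed by Simpson's rule (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3.0
    return 1.0 - 2.0 * central_area


if __name__ == "__main__":
    # r and n as reported in note 2
    for r, n in [(-0.556, 10), (-0.615, 19)]:
        t = t_from_r(r, n)
        p = two_tailed_p(t, n - 2)
        print(f"r = {r}, n = {n}: t({n - 2}) = {t:.3f}, two-tailed p = {p:.4f}")
```

Under these assumptions the computed p values come out close to the reported .095 and .005; small differences are expected because r is given to only three decimal places.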
References
Allen, G. D. and Hawkins, S. (1980). Phonological rhythm: definition and development.
In G. H. Yeni-Komshian, J. F. Kavanagh, and C. A. Ferguson (eds.), Child phonology,
vol. I, pp. 227–56. New York: Academic Press.
Beckman, M. E. and Edwards, J. (2000). Lexical frequency effects on young children's
imitative productions. In M. Broe and J. Pierrehumbert (eds.), Papers in laboratory
phonology VI, pp. 207–17. Cambridge University Press.
432 Mitsuhiko Ota
Berg, T. (1995). Sound change in child language: a study of inter-word variation.
Echols, C. and Newport, E. (1992). The role of stress and position in determining first
words. Language Acquisition, 2, 189–220.
Edwards, J., Beckman, M., and Munson, B. (2004). The interaction between vocabulary size
and phonotactic probability effects on children's production accuracy and fluency in
nonword repetition. Journal of Speech, Language and Hearing Research, 47, 421–36.
of Leiden (HIL dissertations 6). The Hague: Holland Academic Graphics.
Garlock, V., Walley, A. C., and Metsala, J. L. (2001). Age-of-acquisition, word frequency
and neighborhood density effects on spoken word recognition by children
and adults. Journal of Memory and Language, 45, 468–92.
Gierut, J. A. (2001). A model of lexical diffusion in phonological acquisition. Clinical
Linguistics and Phonetics, 15, 19–22.
Gierut, J. A., Morrisette, M. L., and Champion, A. H. (1999). Lexical constraints in
language acquisition. Journal of Child Language, 26, 261–94.
Goodman, J. C., Dale, P. S., and Li, P. (2008). Does frequency count? Parental input and
the acquisition of vocabulary. Journal of Child Language, 35, 515–31.
Hodson, B. W. and Paden, E. P. (1981). Phonological processes which characterize
unintelligible and intelligible speech in early childhood. Journal of Speech and
Hearing Disorders, 46, 369–73.
Holmes, U. T. (1927). The phonology of an English-speaking child. American Speech, 2,
219–25.
Hsieh, H.-I. (1972). Lexical diffusion: evidence from child language acquisition. Glossa,
6, 89–104.
Johnson, J. S., Lewis, L. B., and Hogan, J. (1997). A production limitation in syllable
number: a longitudinal study of one child's early vocabulary. Journal of Child
Kehoe, M. (1999/2000). Truncation without shape constraints: the late stages of prosodic
acquisition. Language Acquisition, 8, 23–67.
Keren-Portnoy, T. (2006). Facilitation and practice in verb acquisition. Journal of Child
Language, 33, 487–518.
Keren-Portnoy, T., Vihman, M., DePaolis, R., Whitaker, C. J., and Williams, N. M.
Journal of Speech, Hearing, and Language Research, 53, 1280–93.
Kubozono, H. (2003). The syllable as a unit of prosodic organization in Japanese. In
C. Féry and R. van der Vijver (eds.), The syllable in Optimality Theory, pp. 99–122.
Leonard, L. B. and Ritterman, S. I. (1971). Articulation of /s/ as a function of cluster and
word frequency of occurrence. Journal of Speech and Hearing Research, 14, 476–85.
Levelt, C. C., Schiller, N. O., and Levelt, W. J. (1999/2000). The acquisition of syllable
types. Language Acquisition, 8, 237–64.
Lleó, C. and Demuth, K. (1999). Prosodic constraints on the emergence of grammatical
morphemes: crosslinguistic evidence from Germanic and Romance languages. In
A. Greenhill, H. Littlefield, and C. Tano (eds.), Proceedings of the 23rd Annual
Boston University Conference on Language Development, pp. 407–18. Somerville,
MA: Cascadilla Press.
Macken, M. (1979). Developmental reorganization of phonology: a hierarchy of basic
(1992). Where's phonology? In C. A. Ferguson, L. Menn, and C. Stoel-Gammon,
MacWhinney, B. (2000). The CHILDES Project: tools for analyzing talk, 3rd edn.
Mahwah, NJ: Lawrence Erlbaum.
Miyata, S. (1992). Wh-questions of the third kind: the strange use of wa-questions in
Japanese children. Bulletin of Aichi Shukutoku Junior College, 31, 151–5.
(1995). The Aki corpus: longitudinal speech data of a Japanese boy aged 1.6–2.12.
Bulletin of Aichi Shukutoku Junior College, 34, 183–91.
(2000). The Tai corpus: longitudinal speech data of a Japanese boy aged 1;5.20–3;1.1.
Bulletin of Aichi Shukutoku Junior College, 39, 77–85.
Morrisette, M. L. and Gierut, J. A. (2002). Lexical organization and phonological change
in treatment. Journal of Speech, Language and Hearing Research, 45, 143–59.
Moskowitz, A. I. (1970). The two-year-old stage in the acquisition of English phonology.
Moskowitz, B. A. (1980). Idioms in phonology acquisition and phonological change.
Journal of Phonetics, 8, 69–83.
Munson, B. (2001). Phonological pattern frequency and speech production in adults and
children. Journal of Speech, Language and Hearing Research, 44, 778–92.
Ninio, A. (1999). Pathbreaking verbs in syntactic development and the question of
prototypical transitivity. Journal of Child Language, 26, 619–53.
Okada, H. (1999). Japanese. In the International Phonetic Association (ed.), Handbook
of the International Phonetic Association, pp. 117–19. Cambridge University Press.
Ota, M. (2003). The development of prosodic structure in early words. Amsterdam: John
Benjamins.
(2006). Input frequency and word truncation in child Japanese: structural and lexical
effects. Language and Speech, 49, 261–94.
Ota, M. and Green, S. J. (2013). Input frequency and lexical variability in phonological
development: a survival analysis of word-initial cluster production. Journal of
Phillips, B. S. (2006). Word frequency and lexical diffusion. New York: Palgrave.
Pye, C. (1992). The acquisition of K'iche' (Maya). In D. Slobin (ed.), The crosslinguistic
study of language acquisition, vol. 3, pp. 221–308. Hillsdale, NJ: Lawrence Erlbaum.
Salidis, J. and Johnson, J. (1997). The production of minimal words: a longitudinal case
study of phonological development. Language Acquisition, 6, 1–36.
Savinainen-Makkonen, T. (2000). Word-initial consonant omissions – a developmental
process in children learning Finnish. First Language, 20, 161–85.
Schwartz, R. G. and Terrell, B. Y. (1983). The role of input frequency in lexical
acquisition. Journal of Child Language, 10, 57–64.
Sosa, A. V. and Stoel-Gammon, C. (2012). Lexical and phonological effects in early word
production. Journal of Speech, Language and Hearing Research, 55, 596–608.
Swingley, D. (2007). Lexical exposure and word-form encoding in 1.5-year-olds.
Tyler, A. A. and Edwards, M. L. (1993). Lexical acquisition and acquisition of initial
voiceless stops. Journal of Child Language, 20, 253–73.
Vance, T. J. (2008). The sounds of Japanese. Cambridge University Press.
Vihman, M. M. (1991). Ontogeny of phonetic gestures: speech production. In
I. G. Mattingly and M. Studdert-Kennedy (eds.), Modularity and motor theory of
speech perception, pp. 69–84. Hillsdale, NJ: Lawrence Erlbaum.
Blackwell.
Vihman, M. M., Kay, E., de Boysson-Bardies, B., Durand, C., and Sundberg, U. (1994).
External sources of individual differences? A cross-linguistic analysis of the phonetics
of mothers' speech to one-year-old children. Developmental Psychology, 30, 651–62.
Vihman, M. M. and Velleman, S. (2000). Phonetics and the origins of phonology. In
N. Burton-Roberts, P. Carr, and G. Docherty (eds.), Phonological knowledge:
conceptual and empirical issues, pp. 305–39. Oxford University Press.
Wang, W. S.-Y. and Chen, C.-C. (1977). Implementation of phonological change: the
Shuāng-fēng case. In W. S.-Y. Wang (ed.), The lexicon in phonological change, pp.
148–58. The Hague: Mouton.
Wijnen, F., Krikhaar, E., and den Os, E. (1994). The (non)realization of unstressed
elements in children's utterances: evidence for a rhythmic constraint. Journal of
Zamuner, T. S., Gerken, L., and Hammond, M. (2004). Phonotactic probabilities in
young children's speech production. Journal of Child Language, 31, 515–36.
Zamuner, T. S., Gerken, L., and Hammond, M. (2005). The acquisition of phonology
based on input: a closer look at the relation of cross-linguistic and child language
data. Lingua, 115, 1403–26.
Appendix A: Word production at the 15-word point
Aki (1;8)
/ai/ blue [aoi]
/tta/ there was [otta], [ta]*
/bibai/ bye bye [bai]*
/bs/ bus [ba]*, [ba]*
/:o:/ butterfly [owa]
/hi/ yes [hai], [ai]
/mp/ jump [ap]*
/kee/ pretty [kije]
/koko/ here [koko]
/koe/ this [koe], [koe], [koje], [koi]
/m/ look [mi]*
/ne/ (tag question) [ne]
/ok/ big [oki], [okki], [hoki], [oki]
/m/ horse [mo]*
/io/ back [io]*
Hiromi (1;3)
/atam/ head [apa]*
/bi/ bye [bai]
/bo/ ball [bo]*, [o]*
/hi/ yes [ai], [ai]
/hai/ bee [kpi]
/i/ good [i]
/ita/ there was [da:]*
/koe/ this [], [k], [ke], [koe]
/mma/ mom [mama]
/ni/ not here [n]
/nnne/ sleep [nenne]
/ppai/ milk [owai]
/ppa/ dad [papa]
/wwa/ doggie [ww], [wawa], [ww], [w]
/jda/ no [dad]
Takeru (1;7)
/iai/ aye-aye (lemur) [aijai]
/ka/ red [ak]
/o/ blue [a]
/bni/ Barney [b]*, [b]*
/bibai/ bye-bye [bawai], [babai]
/bou/ ball [bo]*
/go/ plomp [go]
/kii/ giraffe [kigi]
/koe/ this [xe], [goe]
/ninai/ all-gone [nn]
/nj/ kitty [nam], [nam]
/oki / sweets [k]*, [ga]*
/pi/ toss [wo], [po:]
/ta/ cheetah [cita], [ida]
/wn/ woof [w]
/wwa/ doggie [wwa], [ww]
/s/ juice []*, []*
Kenta (2;0)
/aita/ opened [a:da], [a:da], [a:te]
/o/ blue [a:]
/b:/ boo! [ba:]
/bibai/ bye-bye [ba:ba]
/bs / bus [ba:], [ba]*
/b:b:/ car [byby], [bibi], [bu:bwi], [bu:b], [bb], [bbb]
/b:b:/ piggie [b:b]
/hi/ yes [ha:]
/itiitai / hurt [djdj]*
/k:ki / cake [ke:ki]
/mmma/ meal [ma:ma]
/m/ eye [ge:]
/nnda/ what [wa:da], [na:da], [nanda], [a:da], [na:da]
/ni/ 2 o'clock [dii]
/njo/ kitty [da::]
/tonne/ tunnel [ninne]*, [nenne]*
/z:sa/ elephant [dzdz]
Ryo (1;10)
/ame/ candy [ame], [mame]
/a/ foot/leg [ade]
/bnana/ banana [nana]*
/b:b/ car [b:b]
/dena/ train [dena], [deda]
/iigo/ strawberry [no]*, [o]*
/kae/ frog [kae]*
/ksa/ umbrella [kasa], [kaa]
/mma/ mom [mama]
/m/ eye [me]
/ni/ don't have [nai], [na]
/nni/ what [nani]
/n:a/ big sister [nena]
/nnne/ sleep [nenne], [nene], [ne]*, [ee]
/ohana/ flower [oana]
/pama/ pajamas [pama]*
/p/ bread [pan], [pa]
/ppa/ dad [papa]
/aio/ lion [aion]
/i ka/ deer [ka]*
/imama/ zebra [ima]*, [ma]*, [ma]*
/t/ hand [te]
/wwa/ doggie [wawa]
Notes:
* = truncated form.
= location of the pitch accent. Words without an accent mark are unaccented.
= ingressive airflow.
Appendix B: Tai's word production at 1;6.04
Target and gloss Structure Tai's production
/ka/ red (noun) LL1 [aka]
/aki/ red (adjective) LH2 [akai]
/akete/ open LLL0 [akete]
/ai/ hot LH2 [ai]
/tta/ there was HL1 [atta]
/bnana/ banana LLL1 [n:a:ga:]*
/:z/ cheese HL1 [tsi:j], [tsi:ji], [i:]*
/:/ kiss H1 [:]
/da/ (copula) L0 [da]
/d:zo/ please HL1 [do:zo]
/hi/ yes H1 [hai], [ai]
/hana/ nose LL0 [hana]
/hantai/ opposite HH0 [hantai]
/i k:ki / plane LHL2 [ik:ki], [k:]*, [ko]*
/ho:o:/ kitchen knife HH0 [ho:o:], [ho:o:], [ho:to], [o:to], [o:do], [oto]
/i/ good H1 [i]
/ini/ not there LH2 [ini]
/io/ together HL0 [itto], [t:o]*, [d:o]*, [o]*
/iti/ ouch LH2 [itai]
/ij/ no LL2 [ija]
/:a/ grandpa HH1 [i:a], [i:i]
/kkka/ mom HL1 [kakka]
/kakkoi/ cool HLH3 [kakkoi]
/ki kna/ train LHL2 [ka:ta]
*
/koko/ here LL0 [koko]
/konniiwa/ hello HLLL0 [koniwa] (acceptable informal form)
/ko:ka/ exchange HH0 [ko:ka:], [ko:ka]
/kot/ this way HL2 [koti]
/koe/ this LL0 [koe]
/mdoi/ green LLL1 [mi]*, [m:i]*
/mo/ too L0 [mo]
/mimoi/ hello LLLL1 [mi]*
/ni/ there isn't H1 [nai], [na:], [ne:]
/nna/ seven LL1 [nana]
/natta/ fixed LHL2 [naotta]
/n:a/ older sister HH1 [ne:a], [ne:ta]
/oi:/ yummy HH2 [oii:], [oiji:], [o::]
/okaei/ welcome back LHL0 [ka:i:]*
/p/ bread H1 [pa]
/pi/ toss H1 [poi]
/sakana/ fish LLL0 [nnaka], [naka]*
/s k/ like LL2 [ski]*
/teta/ came off LLL1 [doteta]
/toke:/ clock LH0 [ke:]*
/ttto/ daddy HL1 [totto]
/yo/ (emphatic marker) L0 [yo]
/z:/ elephant H1 [zo:]
* = truncated form.
= truncated form not included in statistical analysis because the target contains a
devoiced vowel or // whose omission can lead to ambiguous syllable counts.
= location of the pitch accent. Words without an accent mark are unaccented.
Part IV
Perspectives and challenges
16 A view from developmental psychology
Lorraine McCune
In developing spoken language children integrate their ability to vocalize with
the experience of meaning. Language typically begins with a period of at least
several months where children express meanings with one word at a time. It is
therefore not surprising that detailed analysis of children's word production
strategies has yielded the information that their production process, level of
phonetic skill, and phonological development are organized at the word level.
In this chapter I will first address the critical importance of template research.
I then consider the relationship between research in early child phonology and
more general studies of the rst phase of language acquisition, and the approach
to the integration of these diverse ideas that Marilyn Vihman and I have
developed. Next is a section on use of the term "representation" and the manner
in which mental representation and entry into language interact in development.
I then briefly review a physiologically based theory regarding linguistic
representation. Finally, I propose a dynamic systems view of the transition into
language.
The importance of the template research
The early case studies exposing individual children's idiosyncratic patterns of
single word production (e.g., Priestly 1977; Macken 1978) did not immediately
lead to the hypothesis that such personal shaping of word productions is a phase
of typical development. That hypothesis must now be seriously entertained.
From a developmental perspective the recognition that adopting one or more
word production templates may be a typical step in the acquisition of language
for most children (Vihman and Croft 2007) is a critical discovery because it
potentially adds to the known developmental sequence of vocal behaviors
characterizing the transition to language. There was a time when babbling
was considered unrelated to speech (Jakobson, 1941/1968). Resolution of that
issue has allowed researchers to recognize prelinguistic vocalizations as
influential, and to use them as a resource in plotting the child's path to language.
Recognizing individual complex and consistent motor production patterns that
are closely related across words for many children, yet differ by child, portends
an even more radical turn in our approach to children's transition to language
than earlier recognition of the importance of babbling.
Vihman and Croft (2007), citing Vihman and Velleman (2000), describe
templates as follows: "specific phonological patterns which fit many of the
words that the child attempts (these words are said to be 'selected'), but which are
also extended to words that are less close to the template (these words are then
'adapted' to fit the template)" (p. 690). They further describe the process of
recognizing the influence of a template on a child's productions:
Three types of clues are generally used to identify a child's word template(s):
(a) Consistency of patterning in a substantial number of the child forms for words
produced in one or more recording sessions or over a period of some weeks or
months;
(b) The occurrence of unusual phonological correspondences between adult and child
forms (i.e., rules or processes or repairs to target word violations of child
constraints), under the influence of a dominating pattern or template;
(c) Frequently, a sharp increase in words attempted that either fit or can be fitted into the
pattern. (p. 693)
The schematic pattern that is used to describe a child's template in this volume
and elsewhere describes the shapes that can be encompassed by the child's
phonetically interrelated word productions. In some sense this can be seen as a
basic phonetic score for expressing meaning. In a given case of communicative
production I envision a child as experiencing simultaneously the intention
to communicate, an internal sense of meaning, and the activation of motor
procedures. For a certain period of development the motor processes seem to be
dominated by a consistent motor framework that nevertheless is modified by
characteristics of individual words. In my view the development of one or more
templates provides that consistent motor framework. Template patterns ordina-
rily emerge from earlier processes such as the Vocal Motor Schemes (VMS)
initially described by McCune and Vihman (1987, 2001). The crosslinguistic
examples in this volume show template development and use to be a dynamic
process integrating the child's ongoing phonetic development with
co-occurring development of word meanings.
Phonology and child language study: relationships over time
As Vihman and Croft (2007) mention, Ferguson and Farwell (1975) reported
some surprises in their findings that pointed ahead to the approach to phonological
development explored in this volume: "One inconsistency is the
existence of a high level of variation of word forms. The range of variability
plus certain regular forms of variation together make it difficult to make statements
about either phonological contrasts or unique underlying forms and
systematic rules, so that traditional forms of phonological analysis are not
strictly applicable" (p. 22). Additional surprising aspects included recognition
of greater accuracy in earlier productions that later showed reduction in
accuracy, and the surprising selectivity in the words individual children chose
to produce. With detailed examination of individual children over the ensuing
years it became clear that phonological development was nonlinear and not
easily explained by existing models. Meanwhile many researchers in the
broader field of child language remained unaware of this news for some time.
Ferguson and Farwell were writing in the midst of a strong resurgence of
interest in the single-word period. The work of early diarists had formed the
basis for major theoretical proposals regarding the joint emergence of sound and
meaning in children acquiring spoken language (e.g., Werner and Kaplan 1963;
Piaget 1962). Diary studies had necessarily been limited in their use of phonetic
analysis, although some tried to capture children's phonetic forms on the fly,
with results that were useful to these theoretical proposals. Surprisingly, in the
1970s, when audio and video recording allowed detailed transcription of infant
utterances, the major longitudinal studies of the single-word period undertaken
by psychologists, and even some linguists, had very little to say regarding the
phonetics or phonology of single-word speech. It was only researchers devoted
to phonological development who delved deeply into the specific phonetic
forms children produced in relation to intended language targets. At the same
time, the major theoretical proposals of Piaget and Werner and Kaplan were
mentioned respectfully but not fully integrated into research paradigms address-
ing early transitions in language.
Bloom and Lahey (1978), who summarized and interpreted the developmental
work of this period, identified the interaction of form, content and
function as critical to both language acquisition and our understanding of
the process of acquisition. Yet early phonetic development (surely a critical
aspect of form) could not be integrated into their work, because of the
isolation of studies of phonological development from other work on acquisition.
Bloom and Lahey remark: "Virtually all of the studies of phonological
development before the 1970s took isolated speech sounds as the unit for
analysis, and tested children in the age range of three to eight years" (1978:
100). Reviewing studies of infant vocal production, they report the growing
amount of evidence about the parameters of the sounds that infants make
along with a consensus that there is little overlap between the sounds made in
infancy and the sounds of early speech (p. 88; a conclusion that was already
being disputed in the field of child phonology). In a prescient section devoted
to phonological development, they reviewed the work of several authors
included in the current volume (e.g., Ferguson and Farwell 1975; Waterson
1971), recognizing the rising interest of the word as a unit of analysis for early
phonological development. For example, they cite Waterson's (1971) view
that children are able to capture perceptual features from the words they hear
and derive structural information or schemata to guide their own productions.
Similarly, they reported Ingram's (1974) recognition of child-specific production
tendencies leading to word productions that differ from typical adult
productions. But this information is isolated in the volume, not integrated
with the developmental themes explored.
In her 1993 monograph Bloom provided an updated review of child phonol-
ogy research, including the recognition of continuity between babbling and
speech. However, even this excellent volume maintained the division between
phonological studies of early language and the study reported in the mono-
graph. The phonological data have no recognizable effects on the questions
asked or the analyses performed in the 1993 study. Children's early words were
assumed to have both relevance and a consistent form that make them recognizable
by parents and interested others as conventional words in the language
(p. 82, emphasis added). The fact that children's forms differed from those
typical of adults was recognized, but the differences were expected not to be too
great and to be consistent for a given child. There has rarely been recognition
outside of the phonological literature that children's several productions of a
given lexical item might vary in phonetic form, or that such variation is of
interest. Phonetic and phonological development was largely omitted from
these studies. Current emphasis on parent report for language assessment has
only exacerbated this problem. There is a great need for the integration of
phonological analysis within the broader acquisition literature. If motor organ-
ization for production is a critical acquisition, the omission of phonetic and
phonological analysis seriously limits the validity of theorizing.
In language research by developmental psychologists the kinds of word
meanings learned, their consistency across children, and rates of development,
were among the primary issues of interest (e.g., Bates, Benigni, Camaioni, and
Volterra 1979; Bloom 1973; McCune-Nicolich 1981; Nelson 1973). The devel-
opment of mental representation was considered a shift in cognitive ability
from earlier perceptual or sensory motor processes that was important for
language acquisition, and some studies targeted relationships between non-
linguistic measures, such as play, with language. But little attention was given
to a possible internal represented form of specic adult words in relation to a
child's productions, which might both differ from the adult's form and vary
among themselves. For researchers in child phonology this was and remains a
critical issue.
For example, Menn (1983) presented what she considered a minimal definition
of a lexicon that includes attention to form: "it at least denotes a
collection of stored, accessible, memorized bits of information about the sounds
and meanings of words and/or their component meaningful parts" (p. 8). In
contrast, child development studies equate the lexicon with a vocabulary list!
To oversimplify, studies of phonological development aim (among other goals)
to determine what a child comes to know that allows (a) comprehension of
specific sound sequences as language and (b) production of sufficiently closely
matching sequences to be recognizable as such by adults who speak the
language. The question of how the child comes to this knowledge is also of
critical interest. Menn's definition of a lexicon (above) reflects information
processing theory, but child phonology has utilized a variety of theoretical
models over the decades, as summarized by Vihman (forthcoming 2014).
Integrating child phonology into general studies of language
acquisition
My initial studies were part of the flurry of developmental data collection with
language acquisition goals that began in the 1970s, taking no account of
phonology (e.g., McCune-Nicolich 1981). A major early goal of my work
was to explore the relationship between the development of representational
play and the development of language, under the assumption that the capacity
for mental representation was an underlying development affecting various
domains of behavior. I found that children showed hypothesized language
levels only when they also exhibited the appropriate hypothesized play levels.
However, a number of children showed delay between the play and language
achievements. I suspected limitations in their ability to produce speech, but had
no idea about how to address this issue.
Collaboration with Marilyn Vihman made it possible to integrate phonolog-
ical considerations into the study of the transition into language in our joint
work (Vihman and McCune 1994; McCune and Vihman 2001), in contrast with
my earlier, more semantically based studies (and her earlier phonologically
based studies). The joint work allowed us to bridge the gap between earlier
psychological and phonological approaches. The dynamic systems approach to
language recognizes that a number of underlying developmental variables,
each with its own trajectory, contribute to a child's shift from prelinguistic
status to becoming a language user (McCune 2008; Thelen 1991). Along with
phonetic skill, the ability to represent internal meaning with external symbols
(e.g., words, signs, or play acts) is a primary requirement for the transition to
language. The child's capacity for mental representation (symbolic ability)
develops over time through interaction with objects and people in the environ-
ment, in the context of cognitive and maturational processes.
Werner and Kaplan's (1963) Symbol Formation was at once ahead of and
behind its time. The theory boldly addressed the need to derive something from
nothing: to start with a child who knows nothing of language, not even that
language exists, and show how development might proceed from the earliest
sounds and bodily movements, in a human social environment, to the produc-
tion of sentences. Theirs is essentially a cognitive and embodied model written
with the grace of a philosophical treatise. It entered an intellectual world
dominated by the perspective of behaviorist thinking, soon to be overtaken by
the Chomsky revolution. This book was revered by developmental students
of child language, but its message was never integrated into mainstream
research. It remains a rich theoretical source for understanding how children
come to language. Research developments over the past half-century have only
increased its relevance. In what follows I will use their model to sketch an
interpretation of how the developments in child phonology included in this
volume allow a more complete theoretical view of how children come to
language.
What sort of representation?
Mental representation, in the sense of Symbol Formation, refers to the relationship
between one element, defined as a symbol, and another element, the
symbolized, which the symbol is said to represent. Simply put, for example, a
word can be considered to symbolize an underlying meaning. Mental represen-
tation from this perspective is a contentful state of consciousness (Searle 1992)
rather than a neural code. This sort of relationship was the primary psychological
meaning of the term "representation" prior to broad adoption of the computer
metaphor of mind. A neurological basis was assumed, but the recognized lack
of knowledge regarding brain physiology and processes prevented theorists from
proposing models of these relationships. More recently, computer simulations
and brain-imaging techniques have emboldened theorists to model possible
neurological relationships underlying behavior. The language of neurophysiological
studies has been adopted in simulations, leading to a confusing superficial
similarity with physiologically based neurological research.
Underlying linguistic representation of a more abstract sort was proposed at
least as early as Chomsky (1965). Gradually the idea that brain and behavioral
processes were guided by the interactions, and perhaps computations, of under-
lying, in some sense physiologically based representations became the entrenched
assumption of many cognitive and linguistic approaches. Emphasis on the role of
such representation in the acquisition process was a logical next step, given this
view of adult language. In contrast, psychological studies of development tended
to begin with descriptions of child behavior and attempted to infer from them
interpretations of meaning and/or communicative goal. The current volume
strongly emphasizes approaches to understanding the relationship between the
underlying or represented form of an adult word that children hear and the forms
that they produce. The path between the two is puzzling. In what way might the
adult input form be represented by the child? In what way might this represented
form provide a basis for the child's very different production? In my view, the lack
of detailed knowledge regarding the neurological mechanisms involved in com-
prehending and producing language limits our capacity to answer these questions.
Instead I suggest that the more detailed exploration of the motor aspects of speech
production implicated by the template findings, along with close attention to the
emerging evidence of embodied aspects of meaning, may bring us closer to a
physiologically based understanding of language acquisition. In the following
I address these issues more fully.
Menn, Schmidt, and Nicholas (this volume), proposing an essentially cognitive
model of phonological development, address the ambiguity in the use of the
term when discussing the representation of a word: "An under-acknowledged
problem in linguistics, and even in psycholinguistics, is that we use the term
'representation' as in 'mental representation', 'underlying representation',
'surface representation', 'semantic representation', etc. without discussing the
concept of representation itself" (pp. 468–9). Menn et al. proceed to clarify
usage within their model and discuss the broad range of information that must
be included in the representation of a word. The focus is entirely on assumed
neurological representation rather than mental representation as a conscious
state. The Menn et al. Linked-Attractor model aims to integrate the best of
current phonological representation theories into a workable whole, in line with
the available data. At the same time Menn and her co-authors acknowledge that
the linked attractor model is not appropriate for predictions at least until it can
be modeled on a computer (p. 485), suggesting that the basis of such analyses
is as much in computational modeling as in analyzing child data. There is some
controversy within philosophy of science between model-builders and other
philosophers, who suggest that proposed cognitive/neurological processing
theories must go beyond models, and be physiologically demonstrable in
order to prove their value. For example, Stich (1992) states that to be exploited
in a respectable scientific theory a concept must be naturalizable (p. 258), that
is, able to be described at a physiological level.
Menn et al. (this volume) recognize that current models are schematic
compared to the real level of explanation – the level of patterns of neural
activity . . . but we're not even close to being able to get data at that level, or
interpreting them if we had them (p. 495). The daunting distance between
research on the behavioral phenomena discussed in this volume and the neuro-
logical bases of development is partly attributable to the wide division between
behavioral and neurological investigation. Rapid development in neurological
study over the past decade or so could provide the basis for more empirically
rooted neurological theorizing about language acquisition. However, neurological
investigations fail to take note of developmental findings in phonology, a
barrier parallel to developmental linguists' and psychologists' lack of familiarity
with progress in neurological work. An additional disconnect is between
models of brain development based in cognitive psychology (most emphasized
in the phonological literature) and more physiologically based models where
research relevant to but lacking a connection with phonological development is
advancing.
The development of more biologically oriented models from the study of
embodied cognition (e.g., Johnson 1987, 2007) and those based on neurological
research (e.g., Hickok and Poeppel 2004) provide additional relevant
perspectives on what physiological processes may contribute to language
comprehension and production. In neurological studies the term "representation"
refers to physical brain locations at various levels of detail (from individual
neuron to functional area) that have demonstrable physiological relations
with bodily elements or processes. Kent (2007) addressed the anatomic, motor,
and sensory foundations of speech development in children, suggesting that a
fruitful direction for theory and research may be found in the study of mirror
neurons (Gallese, Fadiga, Fogassi, and Rizzolatti 1996). Considering physiological
and neurological information in relation to language acquisition in
general, and children's word productions specifically, should enhance our
approaches to understanding developmental trajectories and individual differences
across children. Since the initial description of the mirror neuron system
in monkeys (Rizzolatti et al. 1988; di Pellegrino et al. 1992) there has been
sufficient progress to demonstrate the high likelihood of such a system in
humans (Rizzolatti and Craighero 2004). This has allowed the beginning of
theorizing regarding mirror neurons in infancy (Del Giudice, Manera, and
Keysers 2009) and even research demonstrating the development of such a
system in human infants by 6 months of age (e.g., Lepage and Théoret 2006;
Marshall, Young, and Meltzoff 2010; Nyström 2008). This line of study may be
of critical importance in addressing a wide range of issues in child language
research, but it is beyond the scope of the current chapter.
In my view, both physiologically based neurological representation and
mental representation as a conscious experience of meaning are essential
aspects of language.
Development of mental representation and early language
The capacity for mental representation, defined as a conscious contentful state,
develops over the second year of life and is assumed to underlie different forms
of language and play (McCune 1995). Context-limited words (termed "proto-words"
by Menn 1983) should be distinguished from referential words
(McCune 2008). The former occur embedded within situations at the earliest
levels of representational play (contemporaneous with the transition to mental
representation, as assessed in object permanence tasks: Piaget 1962, Ramsay
and Campos 1978). Play at this period involves simple schemes limited in
application to the child's own body (e.g., putting a cup or toy bottle to the
lips, or brush to the hair; McCune 1995). A child with phonetic skill, for
example having VMS [p/b], may, with parental assistance, learn a word such
as bye-bye, always accompanied by a hand wave and often occurring when
people are departing. Context and word are inextricably linked and production
is often based on a Vocal Motor Scheme (VMS). Some initially context-limited
words may be extended to referential use when the child develops referential
capacity; others will drop out of the repertoire, or continue to be used in limited
but appropriate circumstances.
Werner and Kaplan, from detailed diary studies, traced the emergence of
clearly referential words along a pathway beginning with sounds produced
automatically in certain circumstances, such as the sounds of eating. They
report that both Hildegard Leopold and a child studied by Lewis (1936) derived
their initial vocables, as Werner and Kaplan term the child's initial words, from
eating sounds (Hildegard: [m]; Ament's niece, studied by Lewis: [mammam]).
These referred to a large sphere of events related to food-getting and food
eating (Werner and Kaplan 1963: 111). In this example, rather than being
highly restricted, the word's application is diffuse yet context-embedded.
In theory, the child first experiences the sounds generated while ingesting
food, then begins to be reminded of these sounds in contexts related to eating.
This leads to production somewhat separate from the actual eating that originally
generated the sounds. For Ament's niece the vocable mammam began as a
context-limited word related to eating and food but was gradually shaped in
interaction with the ambient language.
Werner and Kaplan proposed that children form their initial symbols in the
process of gradually developing the ability to represent (i.e., consciously
experience) situations internally, in the absence of perceptual support.
Grounding in bodily experience is seen as critical. An external symbol, such
as a word form, is constructed in relation to the underlying meaning that it
comes to express. They used the diary examples to demonstrate that the earliest
vocables may begin in relation to a sound that is indistinguishable from a natural
activity. In both of the cases cited by Werner and Kaplan, as additional words
were learned, the reference of the initial vocable became restricted. Ament's
niece initially (at 354 days of age) used mammam in reference to her mother and
sister as well as to bread, cakes, and cooked dishes. By 597 days of age clearly
referential words were produced: mammam was delimited to cooked dishes,
while bread and cakes were brodi and her mother was mama, her sister desi.
Werner and Kaplan attribute these developments to an underlying process of
differentiation. As the child differentiates the sound produced while eating,
producing it in the absence of actual ingestion, ongoing experience of the
language accompanying various events of importance to the child facilitates
development of additional meanings in relation to other adult words. As new
meanings come to be used, the original vocalization becomes restricted in its
meaning and use. Then, having established the potential for meaningful vocal-
ization outside the natural source context, the child begins to differentiate the
wider variety of meanings expressed with various vocal forms by speakers in
her environment. A parallel process of integration characterizes the relationship
between internal meanings and the words that come to represent them.
As a result of the transition to referential language the child begins extending
word meanings beyond the original context where they were learned, as well as
differentiating meanings that were overly broad (e.g., the changes in application
of mammam, above). Werner and Kaplan do not specify the referential tran-
sition by name, but it appears in the developmental trajectory of their examples.
These theorists did not consider the form–meaning link to be arbitrary. Rather,
word form and meaning are co-constructed through a process termed dynamic
schematizing. The word form (symbolic vehicle) and meaning (symbolized)
remain related at a neurological level through this developmental process.
Dynamic schematizing is defined as the process that allows the differentiation
of varied aspects of meaning and form, for example, supporting the transition
from use of the original form mammam in varied contexts to the use of separate
forms in relation to each of these contexts (e.g., brodi for bread and cakes and
mama for the child's mother). As forms become differentiated, each becomes
more fully integrated with its internal meaning. Children experience the words
of the language from adults, but then construct both form and meaning through
their own internal processes. (The discovery of template-based production
patterns provides a window into this interactive process.) Because of the joint
developmental history of sound and meaning, hearing a word instantiates an
internal contentful state of meaning. For a child who has achieved the capacity
for referential language, various external circumstances related to that same
state for the child (but perhaps not for adults) may call the word to mind,
presumably its phonetic potential as well as its meaning, leading to production.
The fact that children generalize their productions beyond those expected by
adults testifies to an internal constructive process. While such processes must
have a biological basis, this was not discussed in early works (e.g., Piaget 1962;
Sartre 1948; Werner and Kaplan 1963). Children also demonstrate a capacity
for mental representation outside language by the time these developments
occur (e.g., McCune 1995).
Citing Werner's (1957) theme of the syncretic nature of the young child's
experience, Tucker (2002) supports Werner and Kaplan's claim that linguistic
representations are non-arbitrary (p. 67). "Although we may not be able to
apprehend the fact in experience, the neural architecture teaches that the meaning
of language is multileveled[,] from the gut level that is inherently subjective
to the surface articulation that is communicable within the articulatory conventions
of the culture" (p. 68).
A current neurological model
Hickok and Poeppel (2004) developed a large-scale model of language inte-
grating data from neuropsychology, neuroimaging, and psycholinguistics,
drawing on relatively recent analyses of the cortical organization of vision
to guide this new framework. They propose a bilateral ventral stream of neuro-
logical activity that integrates the acoustic and semantic aspects of language
and a simultaneously active bilateral dorsal stream integrating acoustic and
motor aspects of language. They see these systems as differentiated, but prob-
ably interacting. This model would support the Werner and Kaplan view.
The vehicle/meaning relationship is considered non-arbitrary, but not in the
sense that similar vocal forms should share meaning across languages. Rather,
meaning and form are co-constructed, within the individual, with (in contem-
porary terms) mutual neurological activation in relevant brain structures. Within
the field of child phonology there is clear recognition that a child's production of
the same word varies both within a given session and across time (e.g., Ferguson
and Farwell 1975; Macken 1978; Waterson 1971), yet eventually child
word productions match those of the ambient language. Recognition of this
developmental course supports a construction process, as does the existence
of child-specific templates.
Differential neurological response to motor-specific language lends credence
to the non-arbitrary aspect of meaning. Recent research with adults has found
differentiated neurological activation while participants listened to action sentences
and verbs. Pulvermüller (2002) reported differential EEG activation at
dorsal sites closer to the cortical leg area for listening to verbs such as walking
versus stronger activation at inferior sites next to motor representation of the
hand and mouth for verbs such as talking. Tettamanti et al. (2005) found that
cortical areas that were active during action observation (mirror neuron areas)
also showed differential activation during listening to sentences describing
actions by mouth, hand/arm, or leg. Buccino et al. (2005) found motor-evoked-potential
(MEP) changes specific to the hand or foot neuromotor area
in response to action sentences describing limb actions. Results were specific
for the effector involved in the action sentence heard, and listening to abstract
control sentences had no effect in either study.
These adult effects might be learned associations going from language to
motor activation, or they may support the notion of deep bodily construction
processes in the development of meaning/vocalization relationships that have
left a lasting mark. It is also possible that such peripheral neurological reactions
occur in the moment as we comprehend or produce language and constitute an
aspect of the language meanings we experience.
Language as a dynamic system: the influence of vocal variables
Language development is a highly complex process best understood as the
emergence of a dynamic system. As conceived by Thelen and colleagues
(e.g., Thelen 1989, 1991; Thelen and Smith 1994, 2007), a dynamic systems
approach to development entails pervasive interaction between the organism
and the environment and the possibly asymmetrical development of subsidiary
systems within the organism. (The earlier development of representational play
in comparison with language milestones exemplifies this (e.g., McCune 1995).)
The development of vocal production ability is both a process in the path toward
language and the product of additional underlying developments. McCune
(1992, 2008), relying on joint work with Marilyn Vihman (e.g., McCune and
Vihman 1987, 2001; Vihman and McCune 1994), demonstrated that the assessment
of a number of underlying variables, including both vocal development
and the development of mental representation, interacting in a dynamic system,
could predict the timing of children's shift to referential language.
The McCune model for a dynamic systems view of the transition into
language begins with a basis in the development of mental representation as
an underlying variable that affects the observable behaviors included in the
model. The model assumes a positive social/emotional relationship with one or
more adults in the language community. The two nonvocal skills included are
sensorimotor cognition as defined by Piaget (e.g., Piaget and Inhelder 1969)
and assessed by measures of object permanence and representational play, as
described above. The vocal components of the model were operationalized
following some initial exploratory investigation of data, and then verified on
additional data. Two vocal variables were found to be predictive of the referential
shift in studies of McCune's and Vihman's combined sample of 20
children between 9 and 16 months of age. The first, Vocal Motor Scheme
(VMS) production, is a measure of vocal production skill that McCune and
Vihman (1987) related to the transition to reference. It is assessed by frequent
and consistent use of one or more specific supraglottal consonants.
The term "Vocal Motor Scheme" has its origin in Piagetian sensorimotor
cognition, where skill with a particular movement is termed a "scheme". For
example, Thelen, Corbetta, and Spencer (1996) demonstrated that 6-month-old
children's successive reaches toward an object showed random variation in
trajectory, while by 8 months each child showed a relatively consistent trajec-
tory in repeated reaches, achieving a reaching scheme which could vary with
reference to distance and target characteristics. Analogously, repeated accurate
production of the motor action yielding a given supraglottal consonant, or some
other vocal target, is considered a Vocal Motor Scheme. Children who made the
transition to referential word production by 16 months, the final month of the
study, all showed VMS-level competence with at least two supraglottal consonants
by the time of that transition. Of the words used at both 15 and 16
months, on average 90 percent incorporated each child's specific VMS repertoire
(McCune and Vihman 2001). Vihman's continuing studies have more fully
established the value of this variable (e.g., Keren-Portnoy, Vihman, DePaolis,
Whitaker, and Williams 2010; DePaolis, Vihman, and Keren-Portnoy 2011;
DePaolis, Vihman, and Nakai in press).
Vocalization, like all behavior, has its basis in the neurobiology of the organ-
ism. Tucker (2002) reported that the earlier myelination of primary sensory and
motor cortices in comparison to other brain areas provides the basis for the vocal
control of articulation needed for babble. Thus basic production processes
stabilize early. This would account for a child's frequent production of what
has sometimes been called a "favorite sound" (Ferguson 1978; in our terms,
VMS). Neurological and motor developments contribute to the consistency of
production that provides coherent feedback to the child herself and leads to adult
recognition of the sound's recurrence. More cognitively directed brain regions are
slower in myelination, providing ongoing opportunity for developing meanings
in relation to more complex sequences of articulatory gesture. Tucker suggests
that the retention of juvenile plasticity in limbic cortices may be integral to the
flexibility of human adult cognition, allowing adults to learn the meanings of the
words in a new language, even though to native speakers, they remain only
marginally competent with the sensorimotor articulation of those words (p. 74).
The VMS measure relies on the motoric stability described by Tucker. The
development of a word production template is a more complex matter and may
await additional developments, as a template by definition involves word learning
and so the interaction of sound and meaning.
In our initial report (McCune and Vihman 1987) we noted complex patterns
evolving from initial VMS skill that affected the word shapes the children
produced. Recognition that this more complex development incorporated
phonetic aspects of words in the ambient language as well as the child's own
phonetic tendencies suggested that "word recipes" or "word production patterns"
(Vihman and Velleman 1989), now termed "templates", reflect distinct
processes. However, there is typically continuity in motor development
between the two (Vihman, Velleman, and McCune 1994). This volume clearly
demonstrates the theoretical importance of the template form, but understanding
the developmental role of this process in children's language acquisition will
require additional study.
The second vocal variable implicated in the shift to reference is the commu-
nicative grunt (McCune, Vihman, Roug-Hellichius, Delery, and Gogate 1996).
Two important findings linked communicative grunts with the transition to
reference (McCune et al. 1996). First, we found that referential word production,
for the early talkers studied, and referential word comprehension for the
later talkers, were first observed either in the same monthly session as the onset
of communicative grunt use or in the following session. Second, the early
talkers all showed sharp increases in word production following only limited
use of context-dependent words in earlier sessions. Both earlier and later talkers
more than doubled communicative events (including gesture) at the time they
began communicative grunts (McCune 2008).
Grunts occur autonomically following reflexive laryngeal closure under
conditions of effort or physiological stress. Such reflex closure, across mammalian
species, tends to increase oxygenation to the blood and restore homeostasis,
or facilitate ongoing effortful activity. McCune et al. (1996) found that grunts
first co-occurred with physical effort, then with focused attention, before shifting
to communicative use. To account for the temporal linkage between communicative
grunt use and the shift to reference we reasoned that children's
experience of their own grunt under conditions of effort or attention (internal
meaningful states) might prompt an initial recognition of sound/meaning link-
age, leading to increased attention to the meanings available in their linguistic
environment.
In summary, this dynamic systems model predicts that children will make the
transition to referential language production only when (1) mental representation
reaches the level shown in play by combining pretend acts, (2) phonetic
development reaches a critical point (defined in early talkers by identification of
at least two VMS), and (3) communicative intent comes to be realized by production
of the natural vocalization defined as a communicative grunt. The children
studied all showed communicative gestures earlier than communicative grunts,
but we could not determine whether this is an essential variable in the model.
Children lacking the phonetic skill indexed by two VMS but exhibiting the
requisite communicative and representational skills showed referential comprehension
by gesture, in the absence of word production. The variables identified
above are all indices of underlying abilities and so might be assessed in
other ways. The three vocal variables I have emphasized in this chapter,
VMS, communicative grunts, and word templates, share the property that each
facilitates some aspect of language development over time. In addition they all
contribute in the moment to facilitating communicative production. This dual
behavioral and developmental role is typical of variables within a dynamic
system.
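The three conditions summarized above can be read as a simple conjunctive check. The following Python sketch is offered only as an illustration under stated assumptions: the class and function names, the yes/no coding of each variable, and the treatment of the conditions as jointly sufficient (the model itself states only that the transition occurs when all three are met) are hypothetical and are not taken from the original studies.

```python
from dataclasses import dataclass


@dataclass
class ChildObservation:
    """Hypothetical coding of one child's status at a given session."""
    combines_pretend_acts: bool       # mental representation as shown in play
    vms_count: int                    # number of Vocal Motor Schemes identified
    uses_communicative_grunts: bool   # communicative intent realized vocally


def predicted_status(obs: ChildObservation, min_vms: int = 2) -> str:
    """Return the outcome the model would lead one to expect.

    Assumption: the three conditions are treated as jointly sufficient,
    which simplifies the 'only when' wording of the model.
    """
    representational = obs.combines_pretend_acts
    communicative = obs.uses_communicative_grunts
    phonetic = obs.vms_count >= min_vms

    if representational and communicative and phonetic:
        return "referential production"
    if representational and communicative:
        # Requisite skills present but fewer than two VMS:
        # comprehension shown by gesture, without word production.
        return "referential comprehension only"
    return "pre-referential"


if __name__ == "__main__":
    print(predicted_status(ChildObservation(True, 2, True)))
    print(predicted_status(ChildObservation(True, 1, True)))
```

The middle branch mirrors the observation that children with the requisite communicative and representational skills but fewer than two VMS showed referential comprehension in the absence of word production.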
Automaticity and language production
Although we discuss motor planning for speech, such planning must occur
simultaneously with production, or disfluency results. Vocal communication
reflects automaticity between intention and expression. Vocal Motor Schemes,
communicative grunts and word templates all contribute to the automaticity of
production during the process of language development. VMS consonants
dominate children's early words, no doubt due to ease of production (partly as
a result of practice and familiarity). Word templates provide formats that can
be shaped in word production to integrate child phonetic capacity with ambient
language word shape. Communicative grunts in infancy may be a by-product of
the automaticity required in vocal production.
In adults increased motor activity in the laryngeal muscles is observed
immediately before vocalization, suggesting that laryngeal activation (the
basis of communicative grunts) may be an automatic response to the intention
to vocalize (Buchtal and Faaborg-Anderson 1964; Kirchner 1987; see also
Esling 2012, on infant laryngeal initiation). Communicative grunts, which
continue with some frequency throughout the single-word period, may be
stand-in vocalizations for missing or slow-to-be-recalled words.
Consider the adult experience of searching for a word. This search may not be
silent, as the speaker fills in with sounds such as "eh" or "um" (Goffman 1978),
suggesting continuity with infant grunts, which take a similar form. These
vocalizations also serve as pragmatic or phonetic devices to maintain the
conversational rhythm. Ward (2004, 2006) terms these vocal expressions "conversational
grunts". He reports their broad use in English and other languages,
with some correspondence between phonetic aspects of such expressions and
their meaning in context. These expressions seem to be continuous with infant
communicative grunts.
The critical advantages of automaticity in speech production may be the basis
for early reliance on VMS consonants and the development of production
templates in young children. A well-practiced phonetic repertoire should impact
directly on the transition to referential language use because as the child
experiences the intention to communicate, the internal meaningful experience
may find expression only through fairly routinized motor activity. The child's
internal idea (or meaning) is essentially clothed with sounds and words in the
process of its formation. Having basic motoric potential at the ready when the
communicative intention is experienced must be an essential feature of
communicative speech. A communicative intent, absent motor potential, may
result in a communicative grunt.
There may not be sufficient understanding of the speech-motor development
underlying template-based word production to suggest a coherent theory
of how the capacity for reference might be integrated with this important
development. Progress in understanding the source, function, and develop-
mental trajectory of word templates depends, at least in part, on determining
their phonetic bases in individual children. Can production commonalities be
identified across many children's template formats that might suggest whether
and how the templates increase production ease and thus automaticity? While
templates are based in the individual child's phonology, it is also possible that,
during the instantaneous process of producing a word, echoic influences from
adult words heard in given contexts might affect the formation of the specific
template-based word in its production. Throughout the single-word period word
production would be affected by these various memory- and context-based
influences.
The next phase in language acquisition is the transition to word combina-
tions. I found sharp acceleration in multiword production following a shift to
more advanced symbolic play (McCune 1995), but have not developed a
dynamic systems model of this next phase. There is minimal research evidence
on the relationship between template-based word production and the transition
to combinations. Yet the templates begin to occur just as word production is
rapidly increasing and word combinations are imminent. It would seem that the
attraction to automatic template-based production would need to be either
eliminated or incorporated into fluent multiword production. The fact that
there is a two-word stage in children's development toward grammar opens
the possibility that template effects might be seen in presyntactic combinations,
forming part of a dynamic system for this next transition. (See Donahue 1986
and Matthei 1989 for case studies demonstrating possible template effects in
this transition.)
Despite great progress in understanding neurological development and
functioning, and despite useful models of these processes, we do not actually
know, beyond our metaphors, how a word is produced or how it is recognized in
perception, although progress is being made from various directions. Except in
cases of recording individual brain cell activity, and brain-imaging techniques
identifying general areas of activation under specific conditions, the term "representation"
exists only in a metaphorical context. Phonologists' understanding of
representation uses the metaphor of the brain as an information-processing
device. Psychologists' ideas about mental representation depend upon unverifiable
states of consciousness that must also have some basis in neurophysiology.
The very limitations of our biological knowledge allow us freedom to conceptu-
alize different brain/behavior relationships. Perhaps words are based on relatively
stable internal representations, as some authors in the current volume assume,
or on more dynamic processes where such stability is lacking. It may be the case
that the underlying neurological representation of a word exists only at the
moment of perception or production, emerging as a result of the particular task.
Physiological evidence that might specify the nature of our ongoing linguistic
knowledge is lacking. The studies of differential motor activation in response to
word meanings (mentioned above) suggest that the body itself participates in such
representation, as predicted by Varela, Thompson, and Rosch (1991) for neuro-
logical representation in general. Research is needed that combines relevant
aspects of phonological development with neurological study. Beyond this empir-
ical goal, theoretical integration is needed which will allow researchers from
various fields of endeavor to engage in the same task.
References
Bates, E., Benigni, L., Bretherton, I., Camaioni, L., and Volterra, V. (eds.) (1979). The
emergence of symbols: cognition and communication in infancy. New York: Wiley.
Bloom, L. (1993). The transition from infancy to language. New York: Cambridge
University Press.
Bloom, L. and Lahey, M. (1978). Language development and language disorders. New
York: Wiley.
Buccino, G., Riggio, T., Melli, G., Binkofski, F., Gallese, V., and Rizzolatti, G. (2005).
Listening to action-related sentences modulates the activity of the motor system: a
combined TMS and behavioral study. Cognitive Brain Research, 24, 355–63.
Buchtal, F. and Faaborg-Anderson, K. L. (1964). Electromyography of laryngeal and
respiratory muscles. Annals of Otology, Rhinology and Laryngology, 73, 18121.
Chomsky, N. (1965) Aspects of a theory of syntax. Cambridge, MA: MIT Press.
Del Giudice, M., Manera, V., and Keysers, C. (2009). Programmed to learn: the ontogeny
of mirror neurons. Developmental Science, 12, 350–63.
DePaolis, R., Vihman, M. M., and Keren-Portnoy, T. (2011). Do production patterns
influence the processing of speech in prelinguistic infants? Infant Behavior and
Development, 34, 590–601.
DePaolis, R., Vihman, M. M., and Nakai, S. (In press). The influence of babbling patterns
on the processing of speech. Infant Behavior and Development.
di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., and Rizzolatti, G. (1992).
Understanding motor events. Experimental Brain Research, 91, 176–80.
Donahue, M. L. (1986). Phonological constraints on the emergence of two-word utterances.
Journal of Child Language, 13, 209–18.
Esling, J. (2012). Articulatory function of the larynx and the origins of speech. Plenary
paper presented at the 38th Annual Meeting of the Berkeley Linguistics Society.
Ferguson, C. A. (1978). Learning to pronounce: the earliest stages of phonological
development in the child. In F. D. Minifie and L. L. Lloyd (eds.), Communicative
and cognitive abilities – early behavioral assessment, pp. 273–97. Baltimore:
University Park Press.
Ferguson, C. A. and Farwell, C. B. (1975). Words and sounds in early language
acquisition. Language, 51, 419–39. Reprinted in this volume as Chapter 4.
Gallese, V., Fadiga, L., Fogassi, L., and Rizzolatti, G. (1996). Action recognition in the
premotor cortex. Brain, 119, 593–609.
Goffman, E. (1978). Response cries. Language, 54, 787–815.
Hickok, G. and Poeppel, D. (2004). Dorsal and ventral streams: a framework for
understanding aspects of the functional anatomy of language. Cognition, 92,
67–99.
Ingram, D. (1974). Phonological rules in young children. Journal of Child Language,
1, 49–64.
Jakobson, R. (1941/1968). Child language, aphasia and language universals, trans.
A. R. Keiler. The Hague: Mouton. (Originally published as Kindersprache, Aphasie
Johnson, M. (1987). The body in the mind. University of Chicago Press.
(2007). The meaning of the body: aesthetics and human understanding. University of
Chicago Press.
Kent, R. (2007). In the mouths of babes: anatomic, motor and sensory foundations of
speech development in children. In R. Paul (ed.), Language disorders from a
developmental perspective: essays in honor of Robin S. Chapman, pp. 55–81.
Mahwah, NJ: Lawrence Erlbaum.
Keren-Portnoy, T., Vihman, M. M., DePaolis, R., Whitaker, C., and Williams, N. M.
Journal of Speech, Language and Hearing Research, 53, 1280–93.
Kirchner, J. A. (1987). Laryngeal reflex systems. In T. Baer, C. Sasaki, and K. Harris
(eds.), Laryngeal function in phonation and respiration, pp. 65–70. Boston: Little,
Brown, and Company.
Leopold, W. F. (1930–1949). Speech development of a bilingual child. Evanston, IL:
Northwestern University Press.
Lepage, J. F. and Théoret, H. (2006). EEG evidence for the presence of an action
observation-execution matching system in children. European Journal of
Neuroscience, 23, 2505–10.
Lewis, M. M. (1936/1975). Infant speech: a study of the beginnings of language. New
York: Arno Press.
Marshall, P. J., Young, T., and Meltzoff, A. N. (2010). Neural correlates of action
observation and execution in 14-month-old infants: an event-related EEG
desynchronization study. Developmental Science, 14, 474–80.
Matthei, E. (1989). Crossing boundaries: more evidence for phonological constraints
on early multi-word utterances. Journal of Child Language, 16, 41–54.
McCune, L. (1992). First words: a dynamic systems view. In C. A. Ferguson, L. Menn,
and C. Stoel-Gammon (eds.), Phonological development: models, research,
implications, pp. 313–36. Parkton, MD: York Press.
(1995). A normative study of representational play at the transition to language.
(2008). How children learn to learn language. New York: Oxford.
McCune, L. and Vihman, M. M. (1987). Vocal motor schemes. Papers and Reports in
(2001). Early phonetic and lexical development. Journal of Speech, Language and
Hearing Research, 44, 670–84.
McCune, L., Vihman, M. M., Roug-Hellichius, L., Delery, D. B., and Gogate, L. (1996).
Grunt communication in human infants. Journal of Comparative Psychology, 110,
27–37.
McCune-Nicolich, L. (1981). The cognitive basis of relational words. Journal of Child
Language, 8, 15–36.
Menn, L. (1983). Development of articulatory, phonetic and phonological capabilities. In
Menn, L., Schmidt, E., and Nicholas, B. (This volume). Challenges to theories, charges
to a model: the Linked-Attractor model of phonological development.
Nelson, K. (1973). Structure and strategy in learning to talk. Monographs of the Society
for Research in Child Development, 38, 1–2.
Nyström, P. (2008). The infant mirror neuron system studied with high density EEG.
Social Neuroscience, 3, 334–47.
Piaget, J. (1962). Play, dreams and imitation. New York: Norton.
Piaget, J. and Inhelder, B. (1969). The psychology of the child. New York: Basic Books.
Priestly, T. M. S. (1977). One idiosyncratic strategy in the acquisition of phonology.
Pulvermüller, F. (2002). The neuroscience of language. Cambridge University Press.
Ramsay, D. and Campos, J. (1978). The onset of representation and entry into stage 6 of
object permanence development. Developmental Psychology, 52, 785–97.
Rizzolatti, G., Camarda, R., Fogassi, L., Gentilucci, M., Luppino, G., and Matelli, M.
(1988). Functional organization of inferior area 6 in the macaque monkey. II. Area F5
and the control of distal movements. Experimental Brain Research, 71, 491–507.
Rizzolatti, G. and Craighero, L. (2004). The mirror-neuron system. Annual Review
of Neuroscience, 27, 169–92.
Rizzolatti, G., Fadiga, L., Fogassi, L., and Gallese, V. (2002). From mirror neurons to
imitation: facts and speculations. In A. N. Meltzoff and W. Prinz (eds.), The
imitative mind: development, evolution and brain bases, pp. 247–66. Cambridge
University Press.
Rosch, E. (2000). Reclaiming concepts. In R. Núñez and W. J. Freeman (eds.),
Reclaiming cognition: the primacy of action, intention and emotion, pp. 61–77.
Cambridge, MA: MIT Press.
Sartre, J.-P. (1948/1962). The psychology of imagination. New York: Philosophical
Library.
Searle, J. (1992). The rediscovery of the mind. Cambridge, MA: MIT Press.
Stich, S. (1992). What is a theory of mental representation? Mind, 101, 243–61.
Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P.,
Fazio, F., Rizzolatti, G., Cappa, S. F., and Perani, D. (2005). Listening to action-
related sentences activates frontal-parietal motor circuits. Journal of Cognitive
Neuroscience, 17, 273–81.
Thelen, E. (1989). Self-organization in developmental processes: can systems
approaches work? In M. R. Gunnar and E. Thelen (eds.), The Minnesota symposia
on child psychology: systems and development, vol. 22, pp. 77–117. Hillsdale, NJ:
Lawrence Erlbaum.
(1991). Motor aspects of emergent speech. In N. A. Krasnegor, D. M. Rumbaugh,
R. L. Schiefelbusch, and M. Studdert-Kennedy (eds.), Biological and behavioral
determinants of language development, pp. 339–62. Hillsdale, NJ: Lawrence
Erlbaum.
Thelen, E., Corbetta, D., and Spencer, P. (1996). The development of reaching during the
first year: the role of movement speed. Journal of Experimental Psychology:
Human Perception and Performance, 22, 1059–76.
Thelen, E. and Smith, L. (1994). A dynamic systems approach to the development of
cognition and action. Cambridge, MA: MIT Press.
Thelen, E. and Smith, L. (2007). Dynamic systems theories. In W. Damon (ed.),
Handbook of child psychology, pp. 258–311. New York: Wiley.
Tucker, D. M. (2002). Embodied meaning. In T. Givon and B. M. Malle (eds.), The
evolution of language out of pre-language, pp. 518. Amsterdam: John Benjamins.
Varela, F. J., Thompson, E., and Rosch, E. (1991). The embodied mind. Cambridge: MIT
Press.
Vihman, M. M. (Forthcoming 2014). Phonological development: the first two years, 2nd
edn. Oxford: Blackwell.
Vihman, M. and Croft, W. (2007). Phonological development: toward a radical
Chapter 2.
acquisition. In M. D. Smith and J. Locke (eds.), The emergent lexicon: the child's
(2000). The construction of a first phonology. Phonetica, 57, 255–66.
Vihman, M. M., Velleman, S., and McCune, L. (1994). How abstract is child phonology?
Towards an integration of linguistic and psychological approaches. In M. Yavas
(ed.), First and second language phonology, pp. 9–44. San Diego, CA: Singular
Ward, N. (2004). Pragmatic functions of prosodic features in non-lexical utterances.
Speech Prosody, 325–8.
(2006). Non-lexical conversational sounds in American English. Pragmatics and
Cognition, 14, 113–84.
Werner, H. (1957). The concept of development from a comparative and organismic
view. In D. B. Harris (ed.), The concept of development. Minneapolis: University of
Minnesota Press.
Werner, H. and Kaplan, B. (1963/1984). Symbol formation. New York: Wiley.
17 Challenges to theories, charges to a model:
the Linked-Attractor model of phonological
development
Lise Menn, Ellen Schmidt, and Brent Nicholas
I. Introduction: why it's time for a new model of phonological
development
A. Contemporary setting: the Linked-Attractor model
as a usage-based model
Usage-based (bottom-up, emergentist) models of phonology (e.g., Boyland
2009; Bybee 2006, 2010; Johnson 2006; McMurray, Cole, and Munson 2011;
Peperkamp 2003; Pierrehumbert 2002, 2003), based strongly in laboratory
phonology and computational simulation, have become increasingly elaborated and
convincing, and they are of tremendous importance for child phonology. Probably
their most important contribution, and the one that is the focus of this chapter, is
that they permit us to consider the representation of a word's form as something
that develops continuously over time in strength, precision, and accessibility: a
reconceptualization of representation which has long been psycholinguistically
necessary for really understanding language development. And these models
allow us to bring frequency data to bear on development without ignoring the
equally important contributions of linguistic structure.
This volume as a whole belongs to the growing literature supplying evidence
for usage-based phonological development; that is, the data and arguments
supporting the claim that children's phonology emerges principally from what
they hear and try to say, rather than from an innately guided grammar. Of course
there are universals, because the ambient languages are subject to general
constraints on their structures; additional constraints are imposed by the limitations
of the human infant's articulation, memory, and perception. But the
chapters in this book, as well as other cross-linguistic work, show that differ-
ences across languages and individuals cannot be regarded as minor disturban-
ces to some basically deterministic pattern.
Usage-based models also offer ways to deal with pervasive phenomena that
older phonological theories treat as marginal, e.g., effects of specific speaker
and of language-particular phonetics. They also raise and reframe fundamental
questions about the distinction between phonetic and phonological levels of
description, something that has long needed attention (really, how do children
learn that phones which are in complementary distribution can all be allophones
of a single phoneme? Is it possible that they don't learn any such thing?),
although consideration of these topics lies well beyond the current chapter.
Thanks for many hours of discussion to members of the psycholinguistics and phonetics/phonology
community at the University of Colorado, notably Al Kim, Rebecca Scarborough, Les Sikos, Jill
Duffield, and Bhuvana Narasimhan; also to Ben Munson, Janet Pierrehumbert, Ronnie Silber, and
Carol Stoel-Gammon; and especially to our editors Tamar Keren-Portnoy and Marilyn M. Vihman.
Vihman and Croft's (2007) paper on Radical Templatic Phonology, reprinted
as the first chapter in this volume, is the first major attempt to bring usage-based
modeling into contact with substantial amounts of production data from
children developing their first phonology. However, the Radical Templatic
model needs elaboration so that it can deal explicitly with how and why the
huge gap between a child's production and its adult target gradually closes as
she becomes a fluent adult speaker of her language.
The present chapter concludes the volume by sketching our Linked-Attractor
model of phonological development (Menn, Schmidt, and Nicholas 2009). Our
model extends the Radical Templatic approach beyond its focus on the child's
developing sensory-motor output representations, augmenting it so that it can
also handle three other basic aspects of phonological representation: the child's
developing input representation of the adult model word, the web of relationships
among the input representations, and the child's auditory representation
of her own productions. Such an extension is necessary to account for an
individuals development of an adultlike phonology from prelinguistic begin-
nings; an important paper already moving in this direction is Munson, Edwards,
and Beckman (2012).
In a sense, the Linked-Attractor model brings together the two-lexicon model
(Menn 1983, also reprinted in this volume) and the Radical Templatic model
as ingredients in a new model of child phonology that is compatible with
current usage-based models of adult phonology, and that can grow into an
adult phonology through its experiences of speaking, storing, and understand-
ing words.
We do want to make it clear from the outset, however, that there is no reason
to stop describing child phonology in whatever terms make a particular phe-
nomenon easiest to think about. Rules have an intuitively transparent precision;
constraints describe fundamental regularities, which exist in tension with
the lumps induced by individual experience. The Linked-Attractor model
complements and enriches the insights created by generative and harmonic
approaches to phonology.
B. Historical setting: rules, constraints, abstraction
Child phonology historically concerns what happens to the child's representation
of the adult surface form of a word when the child tries to say it (Kiparsky
and Menn 1977; Smith 1973; Menn 1983). For that reason, child phonology
rules were initially written from an adult-centered point of view: as formal
descriptions of the differences between the adult word and the child word.
This practice was partially justified by the observation that children appeared
to honor more distinctions in perception than in production; therefore, they
might have a complete representation of the form of the adult words they were
attempting (though Waterson [e.g., 1971] and Ingram 1974 thought otherwise).
If a child said [dɪk] for stick or [don] for stone, we called it [s]-deletion and
wrote [s] > ∅ / #_C.
Let's review a few key points about differences between child phonology and
general phonology in familiar theoretical frameworks. Generative Phonology
(Chomsky and Halle 1968) and its direct descendant, Autosegmental Phonology,
as well as the newer constraint-satisfaction approaches to phonology, Optimality
Theory and Harmonic Grammar (see Kager 1999; Kager, Pater, and Zonneveld
2004), were developed for the idealized adult speaker–hearer. Both of these
types of theory operate in terms of what happens to an abstract underlying
segment in a particular phonological environment. How these abstract underlying
forms get into the adult's mind is not addressed; that job has been left to us
developmentalists.
Because child phonology rules relate the adult surface form (or the child's
representation of it) to the child's surface form, a word's underlying form in
child phonology is not nearly as abstract as its underlying form in Generative
Phonology or Optimality Theory. A child's underlying form has been abstracted
from what the child takes to be different tokens of the same word, so it's
essentially the same as the word's surface phonemic representation. In contrast,
the underlying forms of classical adult phonology are intended to account for
morphophonemic alternations (captive, captivity; critic, criticize), so they are
fairly remote from the phonemic surface.
In spite of this essential difference in the abstractness of children's and adults'
underlying forms in generative and constraint-based phonologies, the relation-
ships between underlying and output phonological patterns in the two cases
are similar enough for standard tools of phonology to have been able to bring a
reasonable amount of order to both kinds of data. After the reel-to-reel tape
recorder and the formalism of generative phonology converged to allow child
phonology to become more than a theoretician's toy, rules brought a substantial
amount of order into a messy little world, at least for children who had relatively
regular mappings from adult word to child word (air-brushing out the phonetic
details). The enormous power of generative rules handled physiologically
mysterious patterns like metathesis almost as routinely as it dealt with physio-
logically plausible ones like assimilation and deletion. (In retrospect, that was
not really a good thing, but that's a different story.)
However, the present volume documents a substantial range of phenomena
where familiar types of rules and constraints do not work smoothly enough to
bring order or insight. The problem of unruly child word forms is not new;
they were discussed very early (famously, by T. M. S. Priestly 1977, reprinted
in this volume), but there was no tool for dealing with them as mappings from
adult form to child form. So unruly mappings were set aside, and researchers
only dealt with these children's output forms; the patterns in these outputs
were variously called prosodies (Waterson 1971, this volume), canonical
forms (Ingram 1974, Macken 1978, Menn 1983, this volume) or templates
(Vihman 1996).
Rule-ordering was designed to handle some kinds of subregularities among
exceptions in adult language, and it could do the same for child phonology.
Interestingly, for most children, examples of rule ordering involving more than
one or two words are hard to find. Smith (1973; see especially pp. 13–22 and
158 ff.) provides the major published set of ordered rules, arguing for their
ordering with the standard tools of phonological theory. One of the most
important types of rule ordering is when Rule A has to apply before Rule
B because it removes a segment that would otherwise be input to Rule B (this
is called bleeding order: it bleeds off part of the potential input to a rule). To
write an example informally, Amahl's cluster simplification rule /sw/ > [w]
precedes the labial postponement rule /CwVC/ > [CVC[+labial]], because sweetie
becomes [widi] while quit becomes [kip]. (If the labial postponement rule had
applied first, the output for sweetie would be [sipi].) Such opaque patterns are
still easier to handle with ordered rules than with output constraints or any other
device.
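The bleeding order just described can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than Smith's formalization: words are written as plain strings with one letter per segment ("switi" for sweetie, "kwit" for quit), the consonant and vowel inventories are toy ones, and the table mapping final consonants to labials is hypothetical. The sketch also ignores the voicing that turns the medial consonant of sweetie into [d], so the correctly ordered derivation yields "witi" rather than [widi].

```python
import re

# Toy inventories and a hypothetical consonant-to-labial table (assumptions).
CONSONANTS = "ptkbdgmnsfz"
VOWELS = "aeiou"
TO_LABIAL = {"t": "p", "d": "b", "n": "m", "k": "p", "g": "b"}
LABIALIZABLE = "".join(TO_LABIAL)

CWVC = re.compile(f"([{CONSONANTS}])w([{VOWELS}])([{LABIALIZABLE}])")


def cluster_simplification(form):
    """Rule A: /sw/ > [w]."""
    return form.replace("sw", "w")


def labial_postponement(form):
    """Rule B: /CwVC/ > CVC with the second consonant made labial."""
    def rewrite(match):
        onset, vowel, coda = match.groups()
        return onset + vowel + TO_LABIAL[coda]
    return CWVC.sub(rewrite, form)


def derive(form, rules):
    """Apply the rules one after another, in the order given."""
    for rule in rules:
        form = rule(form)
    return form


attested_order = [cluster_simplification, labial_postponement]
reversed_order = [labial_postponement, cluster_simplification]

for word in ("switi", "kwit"):
    print(word, derive(word, attested_order), derive(word, reversed_order))
```

With Rule A first, the /s/ of sweetie is gone before Rule B looks for a consonant in front of /w/, so Rule B has nothing to apply to; with the order reversed, Rule B applies to sweetie and produces the unattested "sipi". Quit comes out as "kip" under either order, because Rule A never affects it.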
In some cases, constraint theories like Optimality Theory can describe the
input–output relations quite well, and this sometimes requires constraints that
appear to be unattested in adult language. For example, a number of children
acquiring English, including Hildegard (Leopold 1939–49), Patrick (P;
Waterson 1971), and Daniel (Menn 1971), had a top-ranked (i.e., unbreakable)
markedness constraint that sibilants may appear only in word-final position. For
other examples, presented in the Optimality Theory framework, see Bernhardt
and Stemberger (1998).
The question of how rules work in real time, and whether/in what sense they
are real, has been a vexing issue for all approaches to child phonology (and
general phonology). While a sequence of ordered phonological rules or a grid of
ranked constraints has never been claimed to describe a set of events that take
place in real time, the input-to-output formalism tempts users to think of them in
that way. But children often delay in applying new rules to existing words. This
fact is strong evidence that they store (partial) articulatory specifications of at
least the output forms that are not obeying the new rules in an output lexicon,
rather than completely creating them online by rules acting on stored input
forms. (If they aren't doing that, they must be storing the information about
which rules apply to each word along with the word. We won't pursue that
descriptive option here.) U-shaped developmental curves and phonological
idioms also tell us that at least some output forms are established, and therefore
that they are stored in some sense (though not necessarily in every detail). So
the mappings from input to output don't need to act in real time or even to be in
the child's head, any more than rules relating Latin forms to their French
descendants do; they may simply be the observer's account of correspondences.
But that doesn't seem to be the whole story, because some child phonology rules
do seem to work in real time. How can a rule be both online and offline at the
same time?
Menn (1983) broke a child's internal rules into two types: selection rules and
articulatory rules. The articulatory rules (which were never fully described)
were invoked to specify automatic details of articulation that would apply to all
the child's words, such as final devoicing, giving them their final output forms;
they were considered to be online processes. The selection rules, on the other
hand, were used to describe how a child selects which items of information about
the adult word to preserve in her output form, and which information to abandon.
For example, if the target fish has output /pɪs/, the information that the word is a
monosyllable with onset obstruent labial, stressed lax high front vowel, and final
obstruent sibilant appears to be preserved, while the information that the initial
consonant is a fricative and that the sibilant is post-alveolar has been abandoned.
When selection rules like these operate between two stored levels (that is, when
the child is familiar with the target word and is well practiced at producing the
output form), they are offline descriptions of relationships.
But selection rules also sometimes act as real-time maps from the way a child
hears something to how she says it; even some rules that have exceptions seem
to work in real time. LM's two-and-a-half-year-old son Danny, hearing jeep for
the first time, happily and reliably repeated it as /bip/; his rules evidently applied
online for dozens of new words over several months (LM could and did say to
her friends: "Watch, he's going to say your name as . . ."). When we observe a
child immediately picking up a new word and saying it confidently according to
her established mapping patterns, it seems reasonable to suggest that those
patterns comprise a set of procedures operating in real time, procedures that
link how a word sounds to how it will be pronounced.
So whatever does the job of the old selection rules in a new model must explain
how some mappings function both as an offline representation of the information
about a word that a child preserves in output and as a real-time mapping that
connects "what I hear" to "how I'm going to move my articulators."
This dual role of rules (online/offline) is not just a problem for child phonology;
it's been around in generative phonological theory since Halle's Sound
Pattern of Russian (1971) tried to minimize descriptive redundancy by collapsing
morphophonemic rules with allophonic ones, the program carried out more
fully for English, of course, by Chomsky and Halle (1968), and also envisioned
by constraint theories. But we (and I think usage-based theorists in general)
would argue that dual roles, and maybe dual representation for some kinds of
mapping patterns, regardless of the formalism used to express them, is not the
serious problem that it was thought to be fifty years ago.
The classical goal of generative and constraint-based phonology is to describe
phonologies while minimizing the redundancy of the description. But by now we
know that the brain is full of redundant systems, not just because evolution is
inelegant, but because only redundancy can ensure reasonably reliable processing
under noisy conditions. Elegant, minimal systems are not neurologically
realistic. The Linked-Attractor model is unapologetically redundant in its attempt
to be psycholinguistically realistic, so it does not collapse online and offline
processes that do the same thing just because it's more elegant to state generalizations
only once. Being able to capture subtle patterns in the data is a vital
criterion for being a good theory; parsimony is nice, but it can't be allowed to
trump accuracy.
C. What does it mean to take a whole-word approach
to phonology?
As the term is used in this book, it has two components: 1. the exemplar claim:
learning a system starts from learning many individual examples, and 2. the
lexical claim: by about 8 months of age (and maybe earlier), the examples that
an infant learner takes as units are words (or word-like sequences), not sounds
or subword sound sequences.
Part of the evidence for word-based example-driven learning is statistical: the
sound patterns which children learn to recognize and produce are strongly
affected by the learning opportunities afforded by the words in the ambient
language: not just what sounds occur and in what arrangements, but the
frequency with which a sound occurs, the probability that a sound will occur
in a particular phonological environment, and the number of other words with
highly similar sequences of sounds (neighbor words or simply neighbors).
For example, Zamuner, Gerken, and Hammond (2004) show that young children
can repeat the final C of a CVC monosyllable more accurately if the initial
CV of the syllable is a common CV sequence rather than a relatively rare one
(controlling for the frequency of occurrence of the C and the V) (see also Storkel
2001). This transition-probability effect is plausible if we regard learning
language as having a great deal in common with learning other things; for
example, it is parallel to the difference between repeating a phone number with
a familiar area code vs. a novel one: if the area code is familiar, repeating the
rest of the digits becomes much easier. The reader will be able to think of many
parallel examples. (Thanks to Ronnie Silber for the analogy. For reviews,
arguments, and lists of citations see Edwards, Beckman, and Munson 2004,
Munson et al. 2012, and Stoel-Gammon 2011.)
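Two of the statistics named above, the number of neighbors of a word and the probability of a vowel given the preceding consonant, can be sketched in a few lines. The sketch rests on assumptions that are not part of the studies cited: the lexicon is an invented toy list written with one letter per segment, a neighbor is operationalized as a word differing by the substitution, addition, or deletion of a single segment (one common working definition), and the transition probability is computed over word types and word-initial position only.

```python
# Invented toy lexicon, one letter per segment (an assumption for illustration).
LEXICON = ["kat", "bat", "kap", "kit", "dog", "at"]


def is_neighbor(a, b):
    """True if a and b differ by one substituted, added, or deleted segment."""
    if a == b:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:
        shorter, longer = sorted((a, b), key=len)
        # Deleting one segment from the longer word must give the shorter one.
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False


def neighbors(word, lexicon):
    """All words in the lexicon that are neighbors of the given word."""
    return [other for other in lexicon if is_neighbor(word, other)]


def onset_transition_probability(consonant, vowel, lexicon):
    """Type-based estimate of P(second segment = vowel | first = consonant)."""
    candidates = [w for w in lexicon if len(w) > 1 and w[0] == consonant]
    if not candidates:
        return 0.0
    return sum(w[1] == vowel for w in candidates) / len(candidates)


print(neighbors("kat", LEXICON))
print(onset_transition_probability("k", "a", LEXICON))
```

In this toy lexicon "kat" has four neighbors, and two of the three k-initial words continue with "a". A token-based version would weight each word by how often it occurs; which version better predicts children's repetition accuracy is an empirical matter that the sketch does not address.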
Experimental studies of adult language support a whole-word view as well:
for example, Coleman and Pierrehumbert (1997) show whole-word effects in
adult judgments of how acceptable a nonword is (p. 8): "When statistically
valid data on acceptability [of a pseudo-word] is gathered . . . it is found that
deviations [from phonotactic well-formedness] are partially redeemed by good
parts, and that forms which are locally well-formed, in the sense that each piece
is reasonably well-attested [for example, /sll/], can nonetheless be viewed as
improbable overall." They argue that neither classical generative phonology nor
Optimality Theory predicts these results.
Generative and constraint-based grammars, as developed so far, are purely
structural: although phonotactic frequencies can be used in Optimality Theory
and its close relatives as evidence for constraint rankings, neither generative
nor optimality approaches allow linguistic phenomena to be sensitive to word
token or type frequency, which are whole-word phenomena in the second sense
above. However, token-frequency-dependent effects on how words are pro-
nounced are documented for adults. They are, in fact, the cornerstone of the
usage-based phonology literature (e.g., Bybee 2001). To take just two examples,
the auxiliary verb can may be reduced to [kn̩] but the noun can, even when
it is unstressed, does not lose its vowel; the refusal no has a huge range of variant
forms that is quite different from those available to the verb know or the
adjective no.
Most of the chapters in this book present other kinds of evidence for usage-
based phonological development. For example, some of them show that the
variety of ways in which children start to produce speech sounds is better
described in terms of ambient-language effects than by universal patterns.
Wauquier and Yamaguchi's chapter argues that the distinctive prosodic structure
of French affects children's prosodic development in ways that are incompatible
with proposed universals of prosody, and Savinainen-Makkonen's chapter argues
that children learning Finnish are attracted by its conspicuous long consonants,
regardless of proposed universal markedness constraints (long consonants are
considered marked, which is supposed to mean both rare and difficult to learn;
see Hume 2008). These results are general effects (across children and/or across
the lexicon), which Optimality Theory can probably handle. Some other frequency
effects, however, like the ones we introduce later in this chapter, are
lexical (morpheme-specific), and would require substantial modifications to OT.
Like the reprinted classic papers by Waterson (1971) and Priestly (1977) that
we have already discussed, Szreder's chapter shows that some children's
patterns for producing words cannot be described in terms of well-defined
phonological environments or well-defined outcomes. This degree of indeterminacy
is intolerable for a rule-based acquisition theory, because phonological
rules are supposed to be well-defined changes taking place in well-defined
environments. However, constraint theories might be able to handle underspecified
recipes like "in two-syllable words with two or more consonants,
preserve the initial consonant as the word-initial consonant and preserve one of
the non-initial consonants. If it is in the same place of articulation as the initial
consonant, preserve its position in the word; if it isn't, put it in word-final
position and put /j/ intervocalically" (this being our attempt to summarize the
variability of monster > [majs/mjan/] and the other forms documented in
Priestly 1977).
Constraint theories can also be modied to work probabilistically (Davidson,
Jusczyk, and Smolensky 2006, Legendre, Hagstrom, Vainikka, and Todororna
2006), but they do not (or not yet) countenance descriptions of environments
like "anywhere else in the word," or global constraints like "not homonymous
with any other word." (While most children don't seem to avoid homonyms, a
few do: LM's older son Stephen was a strict CV child for his first few months of
speaking, and in all words beginning with singleton consonants, the C was the
initial consonant of the adult word, with one exception. The emotively loaded
word nice as in nice baby, nice teddy bear had been established early as /nai/;
when Stephen learned knife a little later, he produced it as /fai/.)
Let's look further into such theory-challenging data sets, focusing on those
presented in this book. We can divide their difficulties into three categories,
which can be labeled "unruly contextual effects," "lexical identity effects," and
"wild variation."
The unruly contextual effects are still purely phonological, in the sense that
they depend only on the collection or configuration of sounds in the word; they
are unruly in that they defy description in terms of sets of well-formed rules
or constraints. The examples at hand are the Priestly example that we have just
summarized, the attraction to long consonants found in Finnish (Savinainen-
Makkonen), the varying effects of difficult sounds and clusters on other
sounds in the word in Polish (Szreder), the intricately competing ways of
dealing with two positions of articulation within one word (Menn, English),
the labial-V-alveolar-V template of Macken's (1979) classic paper on Spanish
(reprinted in this volume and summarized in Vihman and Croft's chapter), and
the effects of the presence of particular segments in particular positions in
syllables (Waterson, English).
Lexical identity effects, on the other hand, are not purely phonological,
because they depend on the particular word, not just on the configuration of
sounds that compose it; Stephen's /fai/ instead of /nai/ for knife is one example.
The best-known lexical identity effects in child language are those that depend on
the history of the particular word in the child's vocabulary, particularly on when it
was acquired relative to other words. Phonological idioms (originally defined in
Moskowitz 1970; see now Oliveira-Guimarães, this volume) are the classic cases
of such historical lexical identity effects. By definition, phonological idioms
are exceptions to the rules that apply to most of the other words that the child
produces; often, they are persisting forms of the child's earliest words.
Lexical identity effects probably depend, in differing ways, on a word's
frequency in the child's input and output. Our model predicts that phonological
idioms will be words that have high output frequency relative to both the output
frequency of other words and their own input frequency, so that their output forms
become highly entrenched and are relatively less influenced by the adult model. In
contrast, forms that are the leading edge of change towards the adult model
will have relatively high input frequency relative to both the input frequency of
other words and their own output frequency; this will make their input
representations relatively robust without entrenching their output representations.
(It's not possible to gauge the relative weights of the input and output frequency
factors on purely theoretical grounds.) Studies that can provide enough
quantitative data to test these predictions are hard to find, but Ota (2006, this volume)
finds that high input frequency indeed predicts innovation (less truncation),
when phonological structure is held constant. (Sample size limits made it
impossible to get a reliable estimate of relative output frequency.) Ota and
Green (2013) also found input lexical frequency effects. The fly-on-the-wall
data described in Roy (2011) would be able to test the model, and LM is
currently trying to transcribe enough input and output data from the LENA
Foundation to do so also.
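The two frequency predictions above can be made concrete with a toy classifier. This is only a sketch under stated assumptions: the token counts are invented, frequencies are compared as shares of all tokens, and the mean share serves as an arbitrary cut-off for "high". The chapter itself specifies no thresholds and notes that the relative weights of the two factors cannot be set on theoretical grounds.

```python
def shares(counts):
    """Convert raw token counts into each word's share of all tokens."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def predicted_role(word, heard, said):
    """Apply the two contrasts from the text, using hypothetical cut-offs.

    heard: token counts in the child's input (invented for illustration)
    said:  token counts in the child's output (invented for illustration)
    Both dictionaries are assumed to cover the same words.
    """
    inp, out = shares(heard), shares(said)
    mean_in = 1 / len(inp)    # average share of a word in the input
    mean_out = 1 / len(out)   # average share of a word in the output
    # High output frequency relative to other words AND to its own input frequency
    if out[word] > mean_out and out[word] > inp[word]:
        return "idiom candidate"
    # High input frequency relative to other words AND to its own output frequency
    if inp[word] > mean_in and inp[word] > out[word]:
        return "leading-edge candidate"
    return "no prediction"


heard = {"bye": 10, "shoe": 60, "dog": 30}   # hypothetical input counts
said = {"bye": 60, "shoe": 10, "dog": 30}    # hypothetical output counts
roles = {word: predicted_role(word, heard, said) for word in heard}
```

With these invented counts, the word the child says far more often than she hears it is flagged as a possible idiom, and the word she hears far more often than she says it is flagged as a possible leading-edge form.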
Phonological idioms are not the only challenging lexical identity effects in
child phonology. Perhaps the phenomenon that is most disturbing to an orderly
phonology is found in Ferguson and Farwell (1975, reprinted in this volume).
Ferguson and Farwell documented, for several beginning talkers, an unexpected
pattern of token-to-token variation for initial consonants, often crossing adult
phoneme boundaries such as b/d. The variation was greater for some words,
less for others, and never quite comparable from one word to the next. This lack
of comparability across words challenges the notion that their subjects had any
such thing as phonemes, or even phones, i.e., subword units that would be
comparable from one word to another. Unfortunately, there are problems with
this conclusion because of the small amount of data (even in the full data set: see
Appendix 1, reprinted in this volume, pp. 116-24); one can't be sure that the
tokens observed really came from statistically different distributions. Analysis
of very large phonetically transcribed corpora is essential to test whether this
challenge to standard ideas is as important as we think it is.
A more difficult lexical identity problem is shown by the case of a child with
specific language disorder whose data are presented in the Appendix to this
chapter (see also Menn, Schmidt, and Nicholas 2009). Her word collection
defies organization; the output for each word is stable, but it can't be predicted
from the adult model.
Finally, there is the matter of wild variation; in addition to Szreder's data,
much of what Vihman has published demonstrates the instability of the word
forms produced by beginning speakers (see Appendix C of Vihman 1996 and
Appendix II of Menn and Vihman 2011). Yet not all children show this kind of
variation, which makes it difficult to attribute it solely to early lack of
articulatory control. We will come back to this problem as we explore the next issue:
the question of a word's mental representation.
II. Lexical representation
A. What is the representation of a word and where
does it come from?
An under-acknowledged problem in linguistics, and even in psycholinguistics,
is that we use the term "representation" (as in "mental representation",
"underlying representation", "surface representation", "semantic representation",
etc.) without discussing the concept of representation itself. How is a linguistic
representation like or unlike the representation of other kinds of things in
our minds? How do children and adults come to have these representations in
the first place? What are their properties? Exemplar-based developmental
theory posits that representations are built up by sensory experience, and
that abstractions come from the way these concrete memories accumulate
and interact. As Munson et al. (2012: 289) say, "Individuals' knowledge of
speech sounds comprises representations of information in multiple sensory
domains, including representations of the auditory characteristics of the
sounds that they have produced and have heard others producing, of the visual
characteristics of the sounds they have seen others producing, and of the
tactile, kinaesthetic, and somatosensory characteristics of sounds that they
have produced." (See also Vihman, Velleman, and McCune 1994; Vihman,
DePaolis, and Keren-Portnoy 2009.)
What are these multiple sensory memories like? How can we get a handle on
their bristling complexity, and what does our current understanding of the brain
suggest about how they might interact to produce the phenomena described in
the current volume?
A person's representation of a word (or anything else) is probably something
like the collection of everything he or she knows about it plus the collection of
all its connections to anything else. (Our definition differs from that of
McCune, this volume, but the implications of that difference are not clear.)
This web of information is obviously too much to think about at one time, so we
focus on a small part of it: the word's referential meaning (which the child has to
discover from how it is used in the real world), its collocation patterns (that is,
the other words that it tends to be used with by adults and by the child), its
articulatory and acoustic phonetics, and the connections between these kinds of
information. (McCune's chapter gives a more communicatively oriented
perspective on this web.)
In a usage-based model, grammar and phonology arise from accumulating,
comparing, and contrasting memories of examples. Again from Munson et al.
(2012, p. 3): "phonological development involves building progressively
more-abstract structures, starting with raw sensory encodings of the acoustic input
that are first encountered in the womb, to the articulatory representations that
begin in the first year of life, to the abstract representations that continue to
develop throughout the lifespan." How does this happen?
In current models of cognition, storing items in memory is not like storing
physical objects in drawers or filing cabinets or even like electronic files in a
Time-Machine-type backup system. Instead, the overlapping aspects of
separate instances of an event (a particular dog wagging its tail, say) are made
stronger as they accumulate. More specifically: memory is stored in networks
of interconnected brain cells, ultimately reaching the various sensory, motor
(muscle movement), and emotional areas where the original event impinged on
the organism. Connections form between sensations that occur together or in
tight sequence (Hebb 1949: "What wires together, fires together"), and they get
stronger with each repetition of the stimulus event. So, for example, repeated
co-occurrences of hearing the high-frequency noise of the sound /s/, feeling the
position of your tongue, and feeling the air flowing between it and your alveolar
ridge as you say it build tight links between the hissing sound and the sensations
that you feel in making that sound, as well as between those kinds of sensory
information and the motor (movement) instructions that you used to get your
tongue into that position from wherever it was an instant before. Furthermore,
new sensations get linked to memories of similar sensations: for example,
hearing yourself say /s/ gets linked to your memory of hearing someone else say
it, because they arouse overlapping auditory areas of your brain. (This does not
mean that speaker identity information is discarded; see Johnson 1997.)
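The repetition-driven strengthening described above can be illustrated with a minimal Hebbian-style update. This is a toy sketch, not the authors' model: the feature names, the learning rate, and the saturating update rule (each co-occurrence closes a fixed fraction of the gap to a maximum strength of 1.0) are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def strengthen(links, co_active, rate=0.1):
    """Strengthen the link between every pair of co-active features.

    Each co-occurrence moves the link a fraction `rate` of the way toward
    a maximum strength of 1.0, so links grow with repetition but saturate.
    """
    for a, b in combinations(sorted(co_active), 2):
        links[(a, b)] += rate * (1.0 - links[(a, b)])
    return links


links = defaultdict(float)

# Hypothetical bundle of sensations and motor commands that co-occur in saying /s/
saying_s = {"hiss_sound", "tongue_position", "airflow_feel", "motor_plan"}
for _ in range(20):                      # twenty repetitions of saying /s/
    strengthen(links, saying_s)

# A single occasion of hearing someone else produce the same hiss
strengthen(links, {"hiss_sound", "other_speaker"})

strong = links[("hiss_sound", "tongue_position")]   # built up over 20 events
weak = links[("hiss_sound", "other_speaker")]       # built up over 1 event
```

The often-repeated link ends up much stronger than the link formed on a single occasion, which is the gradient property that the rest of the discussion relies on.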
Links in the brain's enormous library of connections can be asymmetrical in
many ways. For example, consider the connections between the phonetic
aspects and the semantic aspects of word representations. The meaning of any
word containing the sound /s/ (snake, castle, house. . .) becomes more tightly
linked to the sound and feelings of saying /s/ as you gain experience in saying
them. But conversely, as your vocabulary grows, each individual sound
becomes less tightly linked to the meanings of particular words, because each
sound recurs in so many words.
Meanwhile, within phonology, the sound and the articulatory gestures of /s/
in any word get more tightly linked to the /s/ in every other word, although
links are tighter to sequences that are more phonotactically and prosodically
similar (Pierrehumbert 2002). The articulatory gesture used to make the sound
/s/ also becomes linked fairly tightly to all of the gestures that can precede or
follow it, which is part of the basis of your tacit knowledge of your language's
phonotactic constraints (and why it can take so much articulatory effort to
overcome them later in life).
Munson et al. (2012: 298), describing the same kind of connection
formation, suggest that the emergence of abstract phonological representations
in childhood is tied to developmental changes in vocabulary size. One inter-
pretation of the mechanism that underlies this association is that increases in
vocabulary size lead to a reorganization of the lexicon along dimensions of
phonological similarity. These dimensions become de facto representations
of the sublexical units like phonemes and syllables. So the phonological
abstraction process can be seen as just the formation of extra-strong connections
across the representations of words when sensory or motor properties (or
other properties) happen to be shared by all (or by a substantial subset of) the
instances of those words, because of the fact that they share a particular sound
or sound sequence. This is what Vihman and Croft imply (p. 47) when they say
that abstraction is the automatic consequence of aggregate activation of high-
frequency tokens, with regression toward central tendencies as numbers of
highly similar exemplars accumulate.
Linguistic representations in a usage-based theory are not only complex cross-
modal objects; they are also dynamic: they can be weaker or stronger, and they
can grow, become more elaborate, and even fade over time. This is not just a
theoretical conclusion; it is supported by laboratory work in language
acquisition, for example, work from Au's second-language group at UCLA (Au,
Knightly, Jun, and Oh 2002; Oh, Jun, Knightly, and Au 2003; Knightly, Jun,
Oh, and Au 2003) and by work on infants by Vihman and Croft (2007). Another
example is an event-related potential (brain wave) study from Tokowicz and
MacWhinney (2005), which looked at beginning L2 learners of Spanish. When
these adult beginners were asked to judge whether Spanish sentences were
grammatical or not, they seemed to be guessing: their grammaticality
judgments were correct only about 50 percent of the time. But when they simply
listened to those sentences while their brain waves were being monitored, they
showed "surprise" brain wave patterns more often when they heard the
ungrammatical sentences than when they heard the grammatical ones. So their brains
must have begun to form representations of the ways words go together
in Spanish, even though this knowledge was still too weak to be accessed in
carrying out a judgment task.
Problems with the lexeme
If we review standard psycholinguistic
terminology, including the terminology used in older developmental
psycholinguistics, we can see that it is almost as bad as purely linguistic terminology in
handling the kinds of phenomena we have just discussed. Classical adult-based
psycholinguistics simply separates the representation of a word into two parts:
the lemma, taken to comprise what a person knows about the word's meaning
and its syntactic properties (e.g., what part of speech it is), and the lexeme, that is,
the word's phonological form. Some models of phonological development
(Kiparsky and Menn 1977; Menn 1983) went one step further: they used a
two-lexicon model, which subdivided the lexeme into the child's auditory/
recognition representation (the input representation) and her production
representation of it (the output representation). As we have already mentioned,
this division was created to capture the observed disconnections between what
young children can recognize and what they can produce, especially the fact that
old well-established output forms of words may persist long after the child has
stopped applying the rules/constraints that originally created them. As we have
already noted, the easiest way to describe this is to say that the old forms are
stored in an output lexicon, rather than being generated online.
But this input/output subdivision of the lexeme is not fine enough either. The
two-lexicon model has rightly been criticized because it is poor at accounting
for the many kinds of variability in a word's output form, such as variation
depending on imitation, on the complexity of the target, or on neighboring
words (for this last, see Menn and Matthei 1992). But that problem is trivial
compared to this one: the two-lexicon model says nothing about how its input
and output representations of words are created in the first place: nothing about
the auditory properties of the input, nothing about motor and sensory feedback
from feeling and hearing oneself speak, no way of getting, storing, and using
information about the distance between what a child says and what the people
around her say. It is, in short, very far from being a psycholinguistic model.
Towards a better model
The goal of this chapter is to propose a better
conceptual model of phonological representation and development, one in
which the child's experiences create her representations of words, sounds, and
phonotactic patterns. We essentially extend Vihman and Croft's template
model, which deals only with output, to the input/output complex that we
have just sketched. To motivate this extension, let's consider more of what a
word's phonological representation has to be able to do, taking "phonological"
in the broad sense of all of the stored input and output information about a
word's sound. First, it must support the word's recognition (with more or less
context) and spontaneous production, including the way production varies over
short (moment-to-moment) and longer (developmental) time frames. In
addition (especially in children), it must be able to change over both short and long
time windows in response to hearing productions from other people.
Some aspects of a phonological representation must be fairly close to a raw
sound data record, or we wouldn't pick up the details of the way other people
speak (Johnson 1997; Munson et al. 2012). Other aspects of a representation
must be filtered by top-down processing based on several kinds of previous
knowledge, such as phonotactic probabilities and what the speaker's message is
likely to be.
What we're not throwing away
The Linked-Attractor model holds
onto four major conclusions distilled from the work of the last forty-odd years of
collecting child phonology data across languages:
1. Templates (attractors for output forms) are real: some beginning speakers
have output patterns with regularities that neither rules nor constraints can
fully capture, because the mappings from input to output are too messy. There
are also input templates, attracting new percepts to stored ones; evidence for
this comes from phenomena like slips of the ear (Browman 1978), the perceptual
magnet effect (Kuhl 1991; cf. also Jusczyk 1993), and the difficulty adults
have with perceiving less-probable phonotactic patterns (Brown and Hildum
1956; Coleman and Pierrehumbert 1997; Davidson et al. 2006; and many
other sources). Like Kuhl and Jusczyk, we will discuss input templates in
purely phonological terms, but that is only an approximation; one must go
well beyond phonology to account for top-down processing problems that
are found in slips of the ear, in children's errors based on word-sound
associations (Vihman 1981), and in the difficulty hearers have in processing
common nouns that are unexpectedly used as family names (such as author
LM's surname).
2. Output constraints (that is, limitations on what sounds can be produced and
in what sequences and syllable positions) are real. Output constraints
motivate rules: that is, most (but not all!) rules are descriptions of the
systematic ways that a child renders the sounds of adult words so that they fit into the
limited set of sounds and sound patterns that she has managed to learn to say.
Output constraints can also explain some kinds of input-output relationships
when the mapping from input to output is not rule-governed. Markedness
constraints, that is, constraints against producing certain segments, glottal
features (like tones and voice quality), and sequences of segments and glottal
features, exist largely because of the shape of the mouth and properties of the
respiratory system (Messum 2005), the clumsiness and unreliability of early
articulatory control, and the fact that children have to learn to overcome this
clumsiness in the service of learning the patterns of the language(s) around
them. Learning to overcome (or demote) particular markedness constraints
that are violated by the ambient adult language (e.g., learning to produce
initial stop + liquid consonant clusters) does not, in general, overcome other
constraints that happen not to be violated by the adult language. For example,
the constraint against syllable-initial s + consonant clusters that is found in
Spanish and Portuguese does not need to be overcome to learn those
languages, and it's very evident that it persists when L1 speakers of those
languages are confronted with s + stop clusters in other languages. In general,
then, constraints that are not violated by the ambient language remain in place
(in OT terms, top-ranked) unless we succeed in learning to pronounce
words in a language that does violate them.
Because children try to sound like the people around them (and because
they are more likely to be understood if they succeed), their forms also tend
to obey what Optimality Theory and Harmonic Grammar call faithfulness
constraints: that is, they generally preserve as much of the sound pattern of
the model word as they can.
3. Stored output forms are real. As we have seen, a child's old, established
ways of saying words persist (sometimes for a short time, sometimes for
many months) after she has found new ways of saying very similar words.
This persistence of particular idiomatic forms can't be explained without
recourse to stored output forms (or to the equivalent in stored information
about which rules apply to which words), because these old forms don't
conform to new rules or to changes in the strength of constraints. (We will go
into this matter in more detail in Section VI.) We proposed above that this
persistence of entrenched forms is related to how often the child says them,
but we don't yet have quantitative data that could test this hypothesis.
4. Rules, that is, regular mappings (Menn 1983 also used the term
"transductions") from what a child perceives to what she says, are real. It's not
just that rules are descriptively handy for some children; rules are needed in
addition to constraints and templates because a rule can generalize to some
new forms even after the constraints that it enforced are being violated by
other new forms. An example of this from Menn's longitudinal work is
Danny's slowly-eroding rule that added [s] to certain words ending in /r/,
documented exhaustively in Menn and Matthei (1992). This rule generalized
to some new forms even as some of the old words that it had applied to were
breaking free of it.
Constraints, even when they are combined with stored output forms, can
capture this persistence of input-output mappings only by the addition of the
powerful formal device of dividing lexical items into strata according to which
mappings/constraint rankings apply to them, and it's very hard to see why a new
word would be added to a stratum when the constraints that defined that
stratum (e.g., no initial fricatives) are no longer active. Generative phonology
is more flexible; Chomsky and Halle (1968) allow several formal devices,
including not only lexical stratification (elaborated in lexical phonology; see
Kaisse and Shaw 1985) but also rule-ordering and ad hoc exception marking,
which permit rules to have exceptions. But the most important reason for
keeping open the option of using rules is heuristic: a rule is a way of directly
describing input-output mapping, and if children behave as if they have direct,
accessible maps from "how I hear it" to "how I say it", then it's useful to be able to
describe this behavior formally.
Some readers will feel that we don't need all four of these constructs, because
together they form a redundant system. However, we repeat that we are
sketching a psycholinguistic model, not a purely linguistic theory. We emphasize
again that the brain and its workings are a redundant system, and that redun-
dancy is necessary for many reasons, such as ensuring performance in a world
of competing stimuli that produce noisy, unreliable signals. A realistic model
must therefore also be appropriately redundant. We will defend redundancy for
an even more fundamental reason in the next section.
III. The Linked-Attractor model
A. Overview: the Linked-Attractor model as a usage-based approach
to the development of phonology and lexicon
The Linked-Attractor model encodes the speaker's phonological knowledge as
a set of three kinds of attractors: production/output templates, perceptual/input
templates, and a new kind of attractor: the mappings between input and output.
We propose that input/output mappings, whether they are well-defined rules
or less ruly connections, should be considered as attractors because ways
of mapping input to output seem to become entrenched, at the whole-word
level and also below (e.g., at the segmental level; recall the online jeep > /bip/
example). Ultimately, such stored input/output mappings are what adults rely
on to say novel words (most obviously when those words are heard in a dialect
or regional accent that is not the one the speaker will use in saying them),
and they are also, at least in part, the source of our entrenched and nearly
ineradicable accents when we produce novel sequences of sounds in second and
later languages, sequences for which we could not have preexisting output
templates. Constraints shape those mappings, and can account for most of the
same phenomena quite well, but it's hard to see how constraints could handle
the frequency-based entrenchment of a particular input-output mapping.
All three of these kinds of attractors are built up incrementally by experience
with hearing, understanding, and speaking. Attractor strength can thus vary
along a gradient from minimal to strongly entrenched. This idea is central, and
it is liberating: it allows us to escape from an oversimplified world in which
a person either has or does not have a representation of a word, a speech
sound, etc.
Emergentist approaches to language development often use landscape or
gravity metaphors to describe the way that links between representations in
the brain get stronger each time they are used, in the same way that watercourses
are gradually worn into the earth by rain and that streams find the shortcuts and
the steepest grades through fields and woods. An attractor is visualized as a low
place, one that water flows to (hence the chaos theory term attractor basin,
adopted by Vihman and Croft). Existing words sculpt a kind of landscape of
attractor basins; overlapping and more frequent words create deeper basins.
Preferred forms are lower in this topography than ones that are difficult;
impossible forms are the highest peaks and ridges.
Related metaphors in the phonological development literature include not
only gravity (pitfall, Menn 1983) but other forces of attraction (warp,
Jusczyk 1993; magnet, Kuhl 1991).
The acceptability of new words is affected by this landscape, sometimes in
ways that are hard to pin down (what exactly is wrong with Pierrehumbert's
slill?). Some aspects of constraint-based theories can be accommodated in the
landscape metaphor, as we will see later. Of course, even a simple phonological
model will quickly need more dimensions than the three of our visible world,
so the landscape/contour idea must become much more abstract; we will
shortly join Pierrehumbert and others in moving our model to an N-dimensional
hyperspace.
B. The development of representation: input and output templates
Early in development, before an infant discovers lexical meaning, her exposure
to sound patterns in meaningful or playful interactions (both the speech of
others and the sound of her own babbled output) builds her first perceptual
templates (auditory/acoustic attractors), creating the phonotactic and phonetic
recognition information that infants are known to possess before the end of
their first year (Gerken 2009; Jusczyk 1997; Kuhl 1991; Maye, Werker, and
Gerken 2002; Saffran, Aslin, and Newport 1996; Saffran and Thiessen 2003;
Velleman and Vihman 2006).
Links start to form between clearly recurring situations (especially those that
arouse strong emotions) and the words that are used in those situations. As a
word like bath or up occurs in more diverse situations, the strongest connections
will come to be those linking whatever is common to most instances of the
word's use and instances of the word's sound. In this way, the process that
builds up links creates a cumulative average over all of the occasions of hearing
a word, so that the child is not led astray by the few occasions when she hears
"You're going to need a bath when we get home", with no bathtub in sight and
no sound of water running. These complex linkages become the input lexical
entries.
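The "cumulative average" idea can be sketched as a simple tally of how often each situational feature accompanies a word. This is an illustrative assumption, not the chapter's mechanism: the episodes and feature names are invented, and link strength is reduced to the share of hearings in which a feature was present.

```python
from collections import Counter


def link_strengths(episodes):
    """Return, for each situational feature, the share of hearings of the
    word in which that feature was present (a crude cumulative average)."""
    counts = Counter()
    for features in episodes:
        counts.update(set(features))
    return {feature: n / len(episodes) for feature, n in counts.items()}


# Hypothetical occasions on which the child hears the word "bath"
bath_episodes = [
    {"bathtub", "running_water", "caregiver"},
    {"bathtub", "running_water", "toy_duck"},
    {"bathtub", "running_water", "caregiver"},
    {"car_ride", "caregiver"},   # "You're going to need a bath when we get home"
]

strengths = link_strengths(bath_episodes)
```

In this toy data the one atypical hearing contributes only a weak link to `car_ride`, while the features shared by most hearings dominate the word's situational profile.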
If we think of the lexicon as being created in this way, usage information,
speaker identity, and emotional coloring will be part of the information that is
stored for each word. The aspects of information that vary the least and have the
strongest emotional associations will become the ones most strongly repre-
sented (other things being equal).
Meanwhile, on the production side, early events of making vocal tract sounds
(both involuntary and voluntary) create links between an infant's patterns of
breathing, vocal fold position, mouth movements, and the sounds that these
actions produce. Repeated events of more and more coordinated jaw
movements, tongue movements, and phonation (vocal fold vibration) accumulate to
create the child's babble routines (Vihman 1996), although they are continually
modified by her maturing neural control and anatomical development. During
the late babble period, the child begins to have the ability to make some of those
sound sequences voluntarily (although we still don't know how this happens).
Some of the sounds the child makes will overlap with and stimulate her auditory
memory of some of the words she has been hearing; this allows her to recognize
that she too can say those words, a discovery that will be aided and confirmed if
her caregivers respond to the meaning of what she says.
Many of a child's early attempts to say words are likely to fall into the late
babble articulatory attractors, so late babble attractors typically become the
articulatory output routines for early words, although new templates are also
likely to be needed for some words. (Stoel-Gammon 1989 suggests that late
talkers may be children who fail to use babble-based patterns.) Each attempt
to say a word also builds or strengthens links between the child's representation
of that word's articulation and what the child means by saying it, as well as
between those representations and the adult responses to it, etc. The result of
these accumulated word production experiences is the formation of stored
output lexical entries: how to say familiar words, what they mean, and also
when to say them. The words in the early lexicon thus create the local patterns
that we describe with canonical forms or templates, as researchers from
Waterson (1971) to Fikkert and Levelt (2006) have seen when they looked
longitudinally at the early months of language production.
When a child imitates a word, her memory of how the adult said it becomes
linked to her own articulatory representation. Because she hears herself, imitation
also adds to her links between how she said the word and how she heard it from
the adult, as well as to the links between that phonological information and its
meaning or occasion of use. And each time that a child attempts to say a word
spontaneously, it adds to or strengthens her links from her intended meaning to
the articulatory representation of how she said it, and to how it sounded when she
said it. (Following out the logic of all this leads to still more kinds of links,
including strengthening the links to accumulated memories of how adults have
said it.)
C. Building the mapping attractors
Strong links from hearing to producing sounds or sequences of sounds constitute
most of what we write as child phonology rules; for example, a productive velar
fronting rule (go > [do]) is an automatized link between hearing a velar stop and
making an alveolar closure. Traditionally, we have omitted writing the rules
governing correct production (/b/ > [b]), and the hundreds of others that are
needed to produce the correct allophones in segmental and prosodic context.
Because these maps are cross-modal, from auditory to motor/kinesthetic, they are
not identity maps, although most acquisition accounts ignore them, not even
describing /b/ > [b] as a rule that the child must learn. The implicit idea that
these input-output maps are trivial is an illusion, caused by the adult-centered
point of view that writes rules only for what the child produces incorrectly, and
by the fact that we use the same notation for how a segment sounds and how it is
said. Some of these correct auditory-to-articulatory maps are established early,
and they may (and eventually will) become the most deeply entrenched of all.
As we said earlier, some kinds of links themselves apparently may become
attractors of other links; detailed longitudinal data (Menn and Matthei 1992)
suggest strongly that established rules can attract new input > output mappings;
that is, new mappings can be warped to become like old ones. The idea that
phonological rules can be attractors is an odd one for linguistics, but it follows
from the same reasoning that we have been using all along: everything that we
do becomes engrained with repetition, regardless of whether this involves links
within modalities (like purely auditory templates) or across auditory and motor
modalities (like babble routines/patterns or lexical entries).
The importance of individual developmental histories, and of individual
words within those histories, makes it impossible to have a valid theory of
child phonology that is just some special case of any extant general phonolog-
ical theory. This is why we are building a psycholinguistic model, not creating
a purely linguistic theory: a purely linguistic theory cannot deal with the
accidental language history of the individual child (e.g., family names, what
favorite toys and foods are called), any more than it can deal with the accidental
history of particular languages (the languages spoken by neighbors, conquerors,
peoples colonized, etc.). But if we want to understand development, we have to
be able to deal with the interaction of general factors and the accidents of a
person's history; we can't pretend that history doesn't happen and doesn't
affect the individual person's grammar. Neither can we ignore what humans
(and their grammars) have in common; individuals vary within an envelope of
universality.
From wholes to parts
We have been discussing phonology as a matter
of how the representations of whole words form. How do parts of words
(consonant clusters, segments, tones in a tone language, rhymes) acquire
their own representations? Network models assume that anything that is held in
common by large numbers of simultaneously aroused items (a rhyme, a segment,
a tone) develops a coherent sub-network across those items; that is, it develops a
representation of its own. In a usage-based approach, forming subword
phonological units is the same kind of process as finding the words and the things
they refer to from multiple instances of hearing caregivers talk about baths,
blocks, hats, and so on. Each time a child says a word, words that have similar
sounds and/or similar articulations are aroused to some extent. Because of this
simultaneous arousal of similar forms, cross-connections will form between
similar items (in this case, representations of several words). In other words,
ubiquitous phonological priming (cf. Cutler 2012) provides the
neurophysiological basis for phonological abstraction.
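The claim that a shared part acquires "a representation of its own" once enough words contain it can be sketched as follows. This is a deliberately crude stand-in for a network model: the toy lexicon, its rough segment-by-segment transcriptions, and the `min_words` threshold are all hypothetical, and real co-activation would be graded rather than all-or-none.

```python
from collections import defaultdict


def shared_units(lexicon, min_words=3):
    """Find segments shared by at least `min_words` stored words.

    Each qualifying segment is returned with the words it links, standing in
    for the coherent sub-network that forms across co-activated items.
    """
    words_by_segment = defaultdict(set)
    for word, segments in lexicon.items():
        for segment in segments:
            words_by_segment[segment].add(word)
    return {
        segment: sorted(words)
        for segment, words in words_by_segment.items()
        if len(words) >= min_words
    }


# Hypothetical early lexicon with rough, illustrative transcriptions
toy_lexicon = {
    "bath": ["b", "æ", "θ"],
    "block": ["b", "l", "ɑ", "k"],
    "ball": ["b", "ɔ", "l"],
    "hat": ["h", "æ", "t"],
}

units = shared_units(toy_lexicon)
```

Here only the initial /b/ recurs in enough words to cross the assumed threshold; as the lexicon grows, more segments and sequences would qualify, in line with the vocabulary-size effect discussed earlier.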
Frequency is never the whole story
Network models use a fairly
sophisticated account of cumulative frequency of a word (or other unit).
Part of it is the cumulative numerical frequency of experiences with a word:
how often the various new and recalled experiences of the many components of
hearing, saying, and understanding that word have been aroused. But frequency
is far from being the only thing that matters. After all, total reliance on input
frequency was one of the things that made mid-twentieth-century behaviorism
such an inadequate theory of language, even for the acquisition of phonology;
for example, behaviorism had no way to explain the glaring fact that /ð/, the
commonest consonant in English, is among the last acquired. A representation
of a familiar word is a rich complex of connections between many sensory,
motor, social, and emotional aspects of a person's experience from the prenatal
period to the present moment, so the strength of that representation depends
on the vividness of its components and on the strength of the connections
between them. If a sound or sound sequence is articulatorily demanding or
hard to hear (for example, because it tends to occur in unstressed syllables), the
odds are stacked against building up strong representations for it.
IV. Problems for the child, the researcher, and the model
Why has it been so hard to create an adequate theory or model for phonological
development? The first problem is variability, which we explore in detail below.
Second, it is often hard to be faithful to the child's phonetics while trying
to illuminate her phonological system. Given the kinds of variation that we
see in children's early output forms, it can be hard to figure out what kind of
abstraction from phonetics to phonology would be appropriate. As for
setting up underlying forms for a child's phonology, we rarely have perceptual
or psycholinguistic data about an individual child, so we don't know
much about their internal representations of the sounds of the forms that
they are trying to match. The broad phonetic or phonemic transcription of
the adult word that we use as a practical substitute for the input representation
may be an underestimate, overestimate, or a mismatch to the child's
actual internal system. For example, Danny's treatment of #/tr/ clusters
(tree > [gi]; Menn 1971) indicates that his underlying representation of the
position of the initial stop was palatal (cf. Joe > [go]) or even velar, not
alveolar.
Third, we tend to think in terms of adult sounds and children's approximations
to them as segment-sized static articulatory target configurations
(bilabial stop, alveolar nasal, . . .), but as gestural (or articulatory) phonology
(Browman and Goldstein 1992) emphasizes, saying a word isn't
saying one sound and then another; instead, articulating a word means
coordinating a set of simultaneous articulatory trajectories that is often
compared to the musical score for a small ensemble. For example, to say
pig, the ensemble of articulators and the vocal folds have to coordinate
roughly like this (reading the time axis from left to right, and omitting
many details):
Time axis:    /p/ . . . /i/ . . . /g/
Lips:         Close lips; open lips, keep them open . . .
Vocal folds:  Bring vocal folds close together . . . ; begin vocal fold vibration . . . ; allow vibration to stop
Tongue:       Bring center of tongue towards hard palate; continue raising center/back of tongue to contact velum
Velum:        Raise velum to close off nasal passage, keep it raised . . .
It is hard but necessary to find ways to think about such coordinated gestural
trajectories, not just about phonemes.
Fourth is the tension between making the model elegant enough to appeal to
users while allowing it to be complicated enough to represent a lumpy reality.
Whether the Linked-Attractor model can manage to be both reasonably attrac-
tive and reasonably accurate remains to be seen.
A. The gorilla in the playroom: variability
A metaphorical gorilla in the room is something that's too big to ignore, no
matter how much one would like to do so. In child phonology, the pervasiveness
of variability in dimension after dimension is too big to keep ignoring. The
goal of a theory is to find the unity behind the observed variation, but when
variability is a major property of the data, the theory itself has to face it and
account for it. In child phonology, instance-to-instance variability appears
to be much greater than in general phonology, and sociolinguistic factors
(even if we extend "social" to include whether the word is imitated or not)
don't come close to accounting for it, nor does the formal device of separating
words into lexical strata with different rankings of constraints or different
sets of rules applying to each stratum. A child's phonological system is
unstable, and the fact that it is developing quickly, both anatomically and
psychologically, contributes to this instability. The child's articulatory tract
geometry changes dramatically as she grows (especially between birth and
24 months; see Kent and Vorperian 2007), her articulatory skill increases
as she practices and gets better motor control, and something about the
mismatch between what she hears others say and what she says (probably
partly automatic, partly conscious) drives her system to move closer to the
way other people talk. Her comprehension/recognition vocabulary is also
growing rapidly, which may be pushing her to more precise representation
of contrasts between words and sounds, and she is also amassing a statistical
picture of the possible variations in the ways that other speakers produce
words. Furthermore, her rules connecting input and output are generalizing, as
well as being created and destroyed.
We list below four levels of pervasive variation in child phonology, and
then focus on the first (token-to-token variation within the individual child)
and second (word level variation across the lexicon of a particular child),
because they provide much of the evidence for an instance-based develop-
mental model.
1. Token (instance) level variation We often see random-looking variability
from token to token within a very short time; sometimes, substantially
different versions of a particular word may be produced within a
few minutes, even when the word is not a new one in the child's output
vocabulary; an example you can hear on the Web is Deb Roy's son's
older form [gaga] alternating with his newer form [wa | ta] for water (Roy
2011). Published studies include Vihman and Velleman (1989), Vihman,
Velleman, and McCune (1994), and Labov and Labov's (1978) longitudinal
diary study.
In toddlers further along in their language development, some words may
show a neater kind of alternation between two well-defined forms (see Oliveira-Guimarães,
this volume). Some of this is alternation during transition from
an earlier output form to its successor in the child's development, but also two
forms may compete with each other over weeks or months with neither of them
clearly belonging to a newer pattern than the other (see the "rule competition"
section of Menn 1983).
2. Lexical level variation: mappings/rules/constraint orderings apply to
some words but not others Even when the output for each individual
word is fairly stable at the token-to-token level, there is usually variation from
one word (type) to the next in the rules that map the adult form to the child's
form. The first modern discussion of word-to-word variation in the relationship
between the child's word form and the adult target form is Moskowitz (1970). In
that paper, Moskowitz coined the term "phonological idiom" to refer to the
words that conspicuously don't fit into a child's system at a given point, either
because they are regressive (too primitive, being entrenched holdovers from an
earlier developmental period) or progressive (too close to the adult form, being
either a leading edge of new change or a single item that the child mastered but
never generalized). The example that Moskowitz made classic is Hildegard
Leopold's first word, a whispered pretty (Leopold 1939). Pretty, which was the
only word that Hildegard produced with a voiceless stop and also the only one
produced with a consonant cluster, remained isolated from the rest of her
phonology for months; however, it eventually succumbed to her general system
and became [bidi]. So pretty is not only an example of the unpredictability of the
domain of a rule (that is, the set of forms to which a rule applies), but also a
perfect example of a U-shaped developmental curve.
Progressive phonological idioms have turned out to be common; probably
most children have one or two. For example, Waterson's Patrick (P) (Waterson
1976) had an early [dik] for stick, but then he developed a velar assimilation
pattern, producing truck, jug, and cake all as [gk]; stick then briefly became
[gk] also (output attractor at work!), and shortly thereafter became established
as [gk].
Lexical variation in rule application is not necessarily limited to the presence
of one or two phonological idioms. Children who cram whole words into
templates, as documented by Waterson (e.g., 1971), Macken (1979), Priestly
(1977), and others discussed by Vihman and Croft, jettison a large part of the
information about the order of the sounds in their target words. When this
happens, rules relating the target words to their outputs can't be written at all
(except with the kind of tortured ad hoc ingenuity that phonologists are taught
never to use). This kind of word-to-word variation also makes it impossible to
order constraints uniformly across the vocabulary.
To make things worse, the assignment of an adult word to the child's canonical
form may be haphazard. For example, Waterson's Patrick produced the
name Rooney with his nasal/voiced glottal fricative template for honey and
hymn/angel rather than with the repeated-palatal-nasal template that he used for
another, Randall, finger, and window. In more complex cases, such as the
phonologically delayed child Ellie (Menn, Schmidt, and Nicholas 2009;
Appendix to the present chapter), it may be almost impossible to make any
general statements about the way the child will render an adult word, even
though each word may be quite stable.
When fine-grained data permit us to look at rules/patterns that are in the
process of changing (Menn and Matthei 1992), the lexical quality of early
phonology is likely to become visible, even though across-the-board changes
(Smith 1973) are also attested. So the real story will involve trying to understand
why some changes are lexical (word-by-word) and some are general (describ-
able by rule change or constraint reordering). Given the relatively coarse
grain of most child phonology data, it may even be the case that all rule changes
will turn out to be word-by-word when we have enough data to examine them
closely. This problem for child phonology is very similar to the problem of the
propagation of new forms in historical language change: do phonemes really
change across the board, or is the process one of gradual lexical diffusion, with
across-the-board change being a possible end state (Labov 1981)?
Why won't lexical stratification handle children's lexical variation? For one
thing, if assignment to a lexical stratum is not to be simply an ad hoc formal
device, there must be some basis for assigning a word to a stratum, that is, to
the domain of a particular set of rules or a particular constraint ordering. In adult
phonology (for example, in the division of the English lexicon into learned
and native stem and affix morphemes), that basis is morphological. Could
lexical strata be set up for child phonology if the way a word is treated has a
basis in terms of the time it was first produced? As we know, forms that have
been acquired early persist in their old patterns of input-output mappings,
along with a few later words that resembled them very closely, perhaps, while
later-acquired words are subject to a different set of rules. But the Menn and
Matthei paper reviewed in painful detail a major problem for both rules and
constraints in using this temporal information: the boundaries of old/new rule
domains expand and contract in a ragged, piecemeal fashion that appears to be
related to word frequency. So dividing the lexicon into strata could only provide
an arbitrary, ad hoc description of this kind of behavior.
Finally, a template-based theory can describe the situations where it is
impossible to predict whether a particular rule, constraint ranking, or output
template will apply to a particular word, but there is probably no way to make
such a lumpy description aesthetically satisfying.
3. Individual differences The third level of variation is that of individ-
ual differences across children. The work of Vihman and her collaborators (e.g.,
Vihman et al. 1994) shows that children learning the same language diverge
from one another's sound patterns in late babble and may make very disparate
starts on words, only slowly reconverging on the ambient language that they are
all learning. There are major individual differences in how children attack the
problem of producing words: for example, what sounds they preserve, whether
they focus on the beginnings or the ends of words, whether they are conserva-
tive and attempt mostly words that they can say fairly well, or are willing to
mash almost any word into one of their favorite output forms (Menn 1983).
Children also differ in the stability of their word tokens and in the reliability of
their rules (or in constraint rankings or in the attraction power of templates). As
we have already indicated, the word-by-word quality (the lexicality) of early
phonology appears to be much stronger in some children than others. Variability
of both of these kinds is especially likely to be found in children who have
developmental phonological problems, who are sometimes described as having
multiple sets of rules operating at the same time (Bernhardt and Stemberger
1998; Stoel-Gammon and Dunn 1985; cf. discussion in chapter 1 of Grunwell
1981).
4. Cross-language differences The fourth level of variation is the level
of the target language. Infant perception studies show prespeech learning of the
phonotactics of the ambient language becoming quite evident by nine months
(for reviews, see Gerken 2009; Jusczyk 1997). The impact of the ambient
language on early child output (the early mastery of geminate consonants in
Finnish and Arabic, the large proportion of monosyllables in English, the
dominance of CVCV disyllables in French and closed syllables in Arabic, the
effect of prominent final syllables in French, the low proportion of CVC
monosyllables in French as compared to English, German, Estonian, and
Dutch, etc.) has been documented in many chapters of this book and in
references cited throughout; see also Peters and Menn (1993), Pye, Ingram,
and List (1987), Vihman (1996).
B. The elephant in the playroom: seeing the creature whole
The metaphorical elephant lumbers through one's conceptual world as something
that keeps being too big and complicated for one person or one approach
to grasp as a whole. In the well-known fable, six blind men report very different
findings from touching different parts of the elephant (it's like a tree, a rope, a
wall, a bull . . .). This story bears on the relationships among the three theoretical
approaches to child phonology that we have been discussing: generative rules,
OT-type constraints, and templates. We claim that there is one real elephant,
phonological development, and that each of these perspectives looks at part of
it and furthermore, that the Linked-Attractor model is a good start on a way of
describing the whole creature.
Using multiple theories simultaneously is not a popular move in linguistics,
which is full of polemics about which theory is Correct. But why shouldn't we
take rules, constraints, and templates to represent different ways of looking at a
complex reality? Instead of being offended by the resulting descriptive redun-
dancy, compare it to the redundancy of looking at a three-dimensional object
from several perspectives and distances, or to looking at an astronomical object
using different parts of the electromagnetic spectrum, from radio waves through
visible light to gamma rays, or to regarding an electron as either a wave or a
particle, depending on the kind of interaction. Different properties of the object
are made clear from different aspects/at different wavelengths, but some of its
properties will show up in two or more of the perspectives. The object itself is in
fact redundant, because any structured object has redundancy, by definition; if
there's no redundancy, what you have is chaos. (More examples of different
ways of looking at a complex reality: classic multi-layered transparent anatom-
ical drawings of the human body; the different kinds and different slice axes
of images provided by CT scans and fMRI images; the images along different
spatiotemporal axes generated from brain studies that measure event-related
potentials; maps at different scales or that highlight different physical and
political features.)
V. Assembling the elephant: elaborating
the Linked-Attractor model
As we said earlier, the Linked-Attractor model assumes that our brain builds up
auditory representations of input forms, articulatory/kinesthetic/sensory and
auditory representations of output forms, and inter-modal representations of
the mappings between them. These representations are created from the accu-
mulation of examples, as stronger and stronger connections between the thou-
sands of neurons involved in hearing, understanding, and speaking are built up
by repeated transmission of neural impulses.
Each perceived or produced-and-perceived form and each executed mapping
has the potential to become an attractor. Its strength will depend on a number
of factors; a major one will be how often the form or mapping is activated. (To
a first approximation, this means how often it is used for hearing, speaking, or
thinking, though these must make rather different contributions to the form or
mapping.) In the landscape metaphor, frequently activated items (including
mappings) become more deeply entrenched.
But how does a particular form attract neighboring forms into its basin or
similar mappings into its groove? What makes attracting neighbors possible is
spreading neural activation; in particular, the fact that activation of any partic-
ular item spreads collaterally to items that are similar to it. (This is a fact about
how our brains work, not something invented for the model.) Without this
spreading, every form would be, essentially, an idiom or an idiomatic mapping,
ignorant of and isolated from its neighbors; general rules like "if the adult
monosyllabic word starts with a fricative + C cluster, put an [s] at the end of
your output form" could not form from particular instances of mappings from
snow to [nos] and smoke to [mos]. Attraction, then, is the basis not just for
entrenchment, but also for generalization from one item to the next. In our
model, when generalization fails, as it does for Ellie in the Appendix, it's
because activation, for some reason, has not spread collaterally.
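As a rough illustration of collateral spreading, here is a toy sketch in Python. It is not part of the Linked-Attractor model as published: the function names, the use of segment overlap (a Jaccard measure) as a stand-in for similarity, and the rate parameter are all assumptions introduced here, and each output form is simply treated as a string of segments.

```python
def similarity(form_a, form_b):
    """Jaccard overlap of the segments in two forms (a deliberately crude measure)."""
    a, b = set(form_a), set(form_b)
    return len(a & b) / len(a | b)


def spread_activation(source, lexicon, rate=0.5):
    """Activate `source` fully and pass a share of that activation to similar forms.

    `rate` is a hypothetical scaling factor: how much of the activation
    leaks to a neighbor that is maximally similar to the source.
    """
    activation = {}
    for form in lexicon:
        if form == source:
            activation[form] = 1.0
        else:
            activation[form] = rate * similarity(source, form)
    return activation


# Output forms mentioned in the text: [nos] (snow), [mos] (smoke), [da] (ball).
lexicon = ["nos", "mos", "da"]
for form, level in spread_activation("nos", lexicon).items():
    print(form, level)
```

Under these assumptions, saying [nos] passes some activation to [mos], which shares two of its three segments, and none to [da]. Setting the rate to zero would model the failure of collateral spreading: every form would then behave like an isolated idiom.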
Let's look closely at some data from Ferguson and Farwell's Appendix 1,
reprinted in this volume, pp. 116-24 (given in Table 17.1). Co-author BN
originally pointed out (Menn, Schmidt, and Nicholas 2009) that there are not
two but three kinds of attractors involved in building the child's phonology: one
built by the accumulation of input tokens for each word, one built by the
accumulation of output tokens, and a third one built by the repeated cross-
modal mappings from input to output. We visualize the attractors created by
input and output tokens as basins, and those created by repeated mappings as
grooves, analogous to increasingly worn footpaths or erosion gullies between
destinations.
Each attractor has the power to drag other representations towards itself (in a
landscape metaphor, it creates a basin or a groove): the more frequent the form,
the bigger the attractor basin or groove, other things being equal. This amount
of information, though obviously less than a complete representation of what a
speaker knows, lets us contemplate the robustness of each form, as a template
representation demands, and also the robustness of each mapping pattern. (For a
visualization, see the Nicholas diagram, Figure 1 in Menn et al. 2009.)
Usage-based theorists (e.g., Hume 2008) have become convinced that no
particular set of features is innate, so it must be the case that speakers are able to
draw on a large number of different kinds of information in order to eventually
form the classications and abstractions that are appropriate for the language(s)
they are acquiring. Munson et al. (2012: 289) put it this way:
The physical world in which humans reside is limited to four dimensions, but the mental
world in which our knowledge of language resides is not similarly limited. Individuals'
knowledge of speech sounds comprises representations of information in multiple
sensory domains, including representations of the auditory characteristics of the sounds
that they have produced and have heard others producing, of the visual characteristics of
the sounds they have seen others producing, and of the tactile, kinaesthetic, and soma-
tosensory characteristics of sounds that they have produced. This information is repre-
sented at multiple levels of abstraction in multiple domains of interpretation.
In other words, a model needs dozens of dimensions in order to describe the
connections that must form among and between the sequences of things that
we do, feel, and hear each time that we say a particular word from beginning to
end, because it has to be able to represent all of the phonetic/phonological
ways in which the sounds and sound sequences in a word can be near or far from
one another. All the available kinds of sensory information (auditory-acoustic,
kinesthetic, visual, motor . . .) from our own and other people's productions of
each word must be represented in the model in some way. Each of these kinds of
information has to be described by its own group of dimensions, though there
will be many intimate correlations among them. Other connections across
examples must also form, if our minds are to find the patterns of our language
instead of just collecting swarms of small regions of information: connections
between similar instances of the same word, between words that have similar
sounds, between words that have similar articulations.
Table 17.1. Raw data from T, session I, Ferguson and Farwell (1973)
Target (tokens)            Output forms (proportion of tokens)
daddy /ddi/ (7 tokens)     d j (1/7), d (2/7), dd (2/7), dt (1/7), di (1/7)
dog /dg/ (5 tokens)        d (5/5)
hi /hai/ (20 tokens)       h: (2/20), ha (12/20), h a (3/20), a (3/20)
see /si/ (2 tokens)        h i (1/2), i (1/2)
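The token counts in Table 17.1 lend themselves to a simple summary of how stable each word is from token to token. The sketch below uses only the counts from the table; the two measures (share of the most frequent variant, and Shannon entropy of the variant distribution) are our own choice for illustration and are not measures used by Ferguson and Farwell.

```python
from math import log2

# Token counts per output variant, from Table 17.1 (T, session I).
# Only the counts are used; the variant transcriptions are not needed here.
TOKEN_COUNTS = {
    "daddy": [1, 2, 2, 1, 1],
    "dog": [5],
    "hi": [2, 12, 3, 3],
    "see": [1, 1],
}


def dominant_share(counts):
    """Proportion of tokens accounted for by the most frequent variant."""
    return max(counts) / sum(counts)


def variant_entropy(counts):
    """Shannon entropy (in bits) of the variant distribution; 0 means one stable form."""
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)


for word, counts in TOKEN_COUNTS.items():
    print(f"{word}: dominant share {dominant_share(counts):.2f}, "
          f"entropy {variant_entropy(counts):.2f} bits")
```

On these measures dog is fully stable (one variant, zero entropy), hi has a clearly dominant variant (12 of 20 tokens), and daddy is spread across five variants with none accounting for more than 2 of 7 tokens.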
A two-dimensional diagram, even with visual devices like color and texture,
can't handle anywhere near the number of dimensions of articulatory, kinesthetic,
and auditory information available to a speaker-hearer; Nicholas diagrams,
in fact, deal only with the initial segment of each target and token, not
with the whole-word articulatory trajectory. To get beyond such limitations, we
need the idea of attractor in the kind of multidimensional hyperspace that we
have hinted at already. A model of word representation, even one that focuses
on phonology, also needs dimensions for syntax, semantics and context,
because these do exert some force on phonological organization, as studies of
language change show. Without semantics, we wouldn't have blends like
hone in and refudiate (many of which eventually become part of the standard
language; a startling number of words of English seem to have originated as
blends of pairs of similar words). For example, hone in wouldn't have formed
from hone down and home in if there hadn't been some semantic relationship
between the two concepts that brought them together in speakers' minds. And
of course, without syntactic, semantic, and pragmatic information, our minds
couldn't figure out morphology and morphophonemics.
In the 3-D landscape metaphor, each basin, channel, or path in the landscape
is an attractor that things tend to fall into, with bigger attractors being able to
trap more of the mind's activity. In a large N-dimensional representation space, a
stronger attractor, rather than being bigger, has more mass. So an attractor in an
N-dimensional space is like a planetary or stellar mass in the physical universe: it
is a small region with mass and a position in the N-dimensional space (N-space,
for short), and it exerts a gravitational pull, or in more modern terms, it warps
the spacetime around it. (It's a region, not a mathematical point, because our
sensory systems don't provide infinitely detailed resolution.) The position information
tells how far each attractor is from the others, and a swarm of very similar
tokens builds up an attractor region, producing, e.g., Kuhl's perceptual magnet
effect.
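To make the gravitational metaphor concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class and function names, the three-dimensional coordinates (a real representation space would need many more dimensions), and above all the inverse-square form of the pull, which is borrowed from the physical analogy rather than derived from any data. The softening term stands in for the idea that an attractor is a small region rather than a mathematical point.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Attractor:
    label: str
    position: tuple  # coordinates in an N-dimensional representation space
    mass: float      # assumed to grow with experience (frequency, vividness, ...)


def pull(attractor, point, softening=0.1):
    """Inverse-square pull of an attractor on a point in N-space.

    The softening constant keeps the pull finite at zero distance,
    reflecting that an attractor is a region, not a point.
    """
    d = dist(attractor.position, point)
    return attractor.mass / (d * d + softening)


def strongest_attractor(attractors, point):
    """Return the attractor exerting the greatest pull on `point`."""
    return max(attractors, key=lambda a: pull(a, point))


# Two hypothetical attractors: a well-entrenched form and a barely known one.
heavy = Attractor("entrenched form", (0.0, 0.0, 0.0), mass=10.0)
light = Attractor("new form", (1.0, 0.0, 0.0), mass=1.0)

token = (0.8, 0.0, 0.0)  # a new token lying closer to the light attractor
print(strongest_attractor([heavy, light], token).label)
```

With these made-up numbers, the token is captured by the heavier attractor even though it lies nearer the lighter one; only a token falling essentially on top of the lighter attractor escapes the heavier one's pull.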
VI. Frequency and variation
A. Evidence from this volume for different kinds of frequency effects
In a usage-based developmental model, then, an attractor may be a representa-
tion of an input form, an output form, a map from input to output, and probably
many other connectables. An attractor's mass is increased in some way by
frequency. A barely known word or sound has little mass; in other words, its
representation is weak. As it becomes more familiar, it acquires more mass and
more ability to warp the space around it. But how does frequency determine
attractor mass? And the frequency of which events? Is it from how often a word
is said? Heard? Thought of but not spoken? How do we add up the frequency of
each pronunciation variant? How do we weight the contributions of segments
vs. whole words vs. units of other sizes? How do we even count them? A full
answer will take years to develop. But we can think about what needs to be
worked on and what directions research might take. What do we know so far?
Most of the chapters in this volume have dealt explicitly with the effect of one
or more kinds of frequency of occurrence on the course of phonological
development. A major theme has been showing that in spite of general
(universal) tendencies, if a structure is frequent in the ambient language
(consonant clusters, CV monosyllables, two-syllable words, geminate consonants,
final consonants . . .), it is likely to be learned earlier by children acquiring
that language than by children learning a language in which it is not so frequent.
But frequency is never the sole determiner of order of acquisition; for example,
Szreder's Grzenio, learning Polish, acquires medial clusters earlier than the
equally frequent initial clusters.
Ota's chapter distinguishes among several different kinds of frequency and
their probable effects on phonological development, while also being careful
not to overstate the power of frequency to override articulatory difficulty. He
focuses on the difference between the overall frequency of a structure in the
ambient language, which is the kind of frequency discussed in the preceding
paragraph, and the frequency of each particular word in the ambient language.
These two types of frequency may interact; as he points out (p. 416):
Lexical frequency may have an impact on the child's phonological system because
repeated exposure to exemplars of the target word leads to a better-specified mental
representation of the phonological information in the word. Under this interpretation, the
relevant type of frequency will be that of the input; that is, the frequency at which the
child hears each lexical item in the ambient language. Alternatively, children may
become more accurate in their production of particular words as they gain experience
in articulating them. The relevant type of frequency for this hypothesis will be that of the
output, or the frequency with which the child attempts to produce each lexical item.
And on p. 417, after reviewing several papers on the order of acquisition of
syllable types: "These findings suggest that more frequently encountered phonological
structures (say, for example, coda /t/) are acquired earlier than less
frequently encountered ones (e.g., coda /b/), but they do not necessarily show
that a structure is acquired earlier in more frequently encountered lexical items
(e.g., /t/ in cat) than in less frequently encountered ones (e.g., /t/ in hat)." From
his own research on Japanese children's early syllable omission, however,
Ota is able to conclude (p. 429) that in fact within each word structure type,
changes affect individual lexical items in a systematic way according to their
input frequencies. Such frequency effects disappear once a word structure
becomes readily available to the learner and no longer induces syllable
omission.
This story makes it clear that input and output frequency probably have
different roles to play in phonological development, and that token frequency
and type frequency also have somewhat different effects. The input representa-
tion of a word becomes entrenched individually based on its input frequency
(other things being equal), but a new structure becomes readily available
because of its total output frequency across a number of different words. This
should apply to templates/canonical forms as well, since they are certainly
structures. As Ota says, hearing a word builds up the representation of the
way it sounds (and, we would add, of the circumstances in which the adult
produces it). Saying a word also produces a complex body of sensory information
that contributes to the word's motor, kinesthetic, and proprioceptive representation,
plus the child's auditory representation of how she herself sounds
when she says it, as well as her intentions and other semantic/pragmatic
information. If the child is imitating or being imitated by the adult, even more
elaborate linkages of information will be created.
Towards the end of his discussion, Ota turns again to the work of other
researchers and raises some problems; one is the often-reported conservatism of
early-acquired words, some of which persist as phonological idioms, sometimes
indefinitely, even after a child starts to use new rules/mapping patterns that
could apply to them.
B. Latent output representations: a problem for testing
the Linked-Attractor model
If all we had to deal with was an input representation of a word consisting of an
accumulation of the auditory forms the child has heard and an output representation
consisting of auditory, kinesthetic, and motor memories of all the times
she has said it, then as we said earlier, the Linked-Attractor model would make a
clear prediction: higher input frequency should lead to more accurate represen-
tation of the sound of a word and predict the use of more accurate rules (as
Ota says). In contrast, higher output frequency should lead to entrenched
kinesthetic and motor patterns of the output form. So higher output frequency
pulls in the opposite direction from higher input frequency. In the absence of
higher input frequency, higher output frequency for a word would predict that
the child would maintain her old way of saying it.
Unfortunately, some odd data from Menn (1976) show that it may be difficult
to test this prediction, at least with children who are selecters/avoiders (rather
than being adapters). It looks as though selecter/avoider children may form
an output representation of a word that they are never heard to say: a latent
output representation. Here is the case in point: Menn's (1976) Jacob, who was
tape-recorded three days a week from 12 to 22 months, was late in developing
labials. Bye-bye was in his receptive vocabulary early, as it usually is for
children acquiring English, and he had begun to imitate it as [dada] at 13.25,
but he avoided all other /b/-initial words for another five months (until about
18.2), except for occasional transient attempts in which he either assimilated or
omitted the labial. Finally, at 19.9, shortly after he had been observed whispering
apple to himself as [p], Jacob attempted box, first as [da], and then with his
first correct initial /b/. In the next several days, other /b/-initial words that were
long-established in his receptive vocabulary each also showed first attempts
with #[d], swiftly corrected to #/b/, but not the word ball, which was produced
as [da] for several weeks, even after bye-bye had switched to a correct form.¹
It seems, then, that Jacob had formed some kind of latent form for ball with
an onset #[d], a form that he either inhibited completely or didn't say where
anyone could hear it, and that after he learned that he could at last say /b/, he
forgot to listen to and correct this deeply entrenched form. In order to account
for the new transient d-initial productions of the other old b-words like box,
we would have to say that he still had an entrenched #/b/-to-#[d] rule that was
on its way out, and/or that he also had latent d-initial output forms for those
familiar words, but not so deeply entrenched as the very important word ball.
(If you have lived in an apartment with a lively toddler who likes to throw
things, you know how often one says "That's not to throw; get the ball, balls are
to throw, where's the ball?")
Because latent and therefore unobservable output representations like Jacob's
[da] for ball may exist in other avoider children, we have a nasty loophole in
an otherwise testable model. Our claim that the most conservative forms are the
ones that have become most deeply entrenched because their output forms are
the most frequent will have to allow for the possibility that there are unobservable
latent output forms, which of course can't be counted. So, unfortunately,
the prediction that the output forms that resist updating are the ones that are most
common in a child's output (if input frequency is held constant) can only be
tested for children who do a lot of adapting and very little avoiding.
C. How the Linked-Attractor model accounts for variation and change
What good is this elephantine Linked-Attractor conceptual model? It's too
difficult to use for making detailed predictions, at least until it can be modeled
on a computer. But we think it nevertheless serves a number of purposes. First, it
gives us a way to think about representations of words, sound patterns, and
sounds as multipart, multimodal (i.e., involving sound, movement, touch and
meaning, usage, syntax) psychological objects that grow and change over time.
Second, it extends the attractor metaphor used by Vihman and Croft (and its
relatives as used by other authors) so that we have a unified way of thinking
about how input representations, output representations, and even the rules/mappings
that connect them, are objects that are affected by one another, the
strong ones pulling the weaker ones in. Third, it helps lay out research programs
to try to determine more precisely how representations change and interact,
reminding us that even though we may focus our work on adult input word
frequency and child output word frequency because they are the most easily
observed data, there are other relevant factors that we can at least list and
perhaps think about. In particular, we can always approximate the full creature
by working with part of it at a time (either with the smaller parts we are used
to or with the multipart but still simplified Nicholas diagrams) while remembering
that they are in fact only parts of our elephant.
The Linked-Attractor model is also helpful in thinking about variability.
Recall that we set out four levels of variability that a theory of phonological
development needs to account for: 1. token-to-token production variation across
attempts at a given word; 2. lexical variation in whether/which rules or patterns
apply to a particular word as compared to others in a given child's vocabulary;
3. variation in mapping patterns and preferred output forms across children
within a given language; and 4. cross-language variation in mapping patterns
and preferred output forms. Other chapters in this book have explored the
third and fourth kinds of variation, and have discussed how they support
exemplar models, and therefore (at least some aspects of) ours. So we will
focus on how the Linked-Attractor model is helpful in dealing with the first two
types of variation: token-to-token variation and lexical variation.
Let's consider a child's token-to-token variation in the production of a given
word. We've already divided this into two subcategories. The first is the messy,
random-looking variation that is characteristic of words that have new sounds
or complex sound patterns (an L2 analogue to this is the way in which English
learners flounder among many variants of Russian complex consonant clusters
containing trilled /r/, such as the onset of zdravstvuite 'hello', or how Japanese
learners of English struggle with words that have several instances of both /l/
and /r/, like library). The second subcategory is the much tighter type of
variation shown by words that vacillate between two well-defined output
targets: commonly, between older and newer variants of the same word, but
sometimes between output forms that coexist and compete over an extended
period of time.
Any usage-based model has several basic ways to account for output variation
(in skilled as well as in unskilled speakers). In some cases, the best metaphor
would be the effect of the scatter of the previous outputs, which, as Pierrehumbert
(2002) says, cumulatively build up a somewhat larger target region (think of the
tight scatter of bullet holes around the center of a target that would be produced
by a skilled marksman vs. the looser scatter that would be produced by a novice).
When a native-speaker adult says a word the same way multiple times (that is,
without discernible sources of variation in mood, emphasis, speech rate, or
whom it is addressed to), there is a residual level of measurable variation in
the output sounds, even if it is too slight to be transcribable by trained listeners.
This range of variation defines the target pronunciations for the allophones in the
word in the particular style, mood, accent, etc. that the speaker is using.
The poor motor control of unskilled speakers means that they frequently land
outside what the language considers the appropriate pronunciation: they have a
larger region of scatter. And if the random shots are all pulled towards some
other attractor, that region is not symmetrical around the original target. So the
scatter around a target could build up an incorrect attractor off to the side (so to
speak). This seems a likely source for extremely common processes like substituting
stops for fricatives and devoicing word-final voiced obstruents.
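The idea that biased scatter can build up an off-target attractor can be illustrated with a toy simulation. The sketch below is only an illustration under stated assumptions, not an implementation of the Linked-Attractor model: pronunciation is reduced to a single number, the function name simulate_drift and every parameter value (pull, noise_sd, the position of the competing attractor) are hypothetical, and each production is simply stored as a new exemplar that later productions aim at.

```python
import random


def simulate_drift(target=0.0, other_attractor=1.0, pull=0.2,
                   noise_sd=0.05, n_productions=500, seed=0):
    """Return the centre of a one-dimensional exemplar cloud after practice.

    All names and values are illustrative assumptions, not measured quantities.
    """
    rng = random.Random(seed)
    total, count = target, 1  # the cloud starts as one adult-like exemplar
    for _ in range(n_productions):
        aim = total / count  # the speaker aims at the centre of her own cloud
        # Each attempt scatters randomly and is tugged toward the other attractor.
        produced = aim + pull * (other_attractor - aim) + rng.gauss(0.0, noise_sd)
        total += produced  # every production becomes a new stored exemplar
        count += 1
    return total / count


if __name__ == "__main__":
    print("no competing pull:", round(simulate_drift(pull=0.0), 3))
    print("with competing pull:", round(simulate_drift(pull=0.2), 3))
```

Under these assumptions, the centre of the cloud stays close to the original target when there is no competing pull, and ends up well off to one side when every attempt is tugged in the same direction.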
Bimodal variation, such as Danny's boot > [dut]-or-[bup] rule-conflict pattern
described in Menn (1983), C's template-filling variations like monster >
[majs]-or-[mjan] in Priestly (1977), or variation between earlier and later variants of a
word, like Danny producing tub either as [bb] or [tb] (Menn 1971: 247),
would occur when an object (word, sound, sound pattern, rule . . .) is pulled by
two different attractors at the same time. As sometimes happens to masses in the
real universe, the pulls are nearly balanced. Being able to account for competing
responses is an essential property of the attractor idea, and it has very general
application, because behavioral models in general need to account for competi-
tion among possible responses (do I go straight or turn right?). Closer to home,
psycholinguistics has long used increased reaction times as an index of cogni-
tive load when a speaker or hearer has near-balanced choices to make (e.g.,
Swinney 1979).
Another good feature of usage-based attractor models and their
weaker-or-stronger/multi-aspect representations is that they allow for the right kind of
unobservable events. Think of how an undersea volcano slowly becomes
larger until it breaks the surface, and how it may alter wave patterns just before
it does so. Now consider a child who is in the gravitational grip of a strong
rule, for example an assimilation rule that ensures consonant place harmony.
Suppose she has been saying [gak] for sock, and therefore hearing herself
say [gak], while everyone else is saying [sak] for those floppy things she likes
to pull off her feet. Why does it take her so long to shift her pronunciation? Why
is it so hard for her to even imitate the correct pronunciation? Why is she likely
to be able to finally imitate it shortly before she starts to use the correct version
on her own?
Here we get to the payoff from the Linked-Attractor model's ability to
simultaneously represent input attractors, rule/mapping pattern attractors, and
output attractors. Let's work through the case where our hypothetical child can
say [s] as long as there is no other obstruent in the target word; that is, she can
say words like see and Sue correctly. (The story is interestingly different in the
cases where she has no output [s] or when she has [s] restricted to word-final
position, but we will leave those as exercises for the reader.)
When this child hears and understands an adult saying /sak/, it will activate
her input and output exemplars of the form of the word /sak/ and her mappings
between those inputs and outputs. Simultaneously, it will activate her auditory
exemplar of the [s] sound at word onset, the [a] word-medially, and the [k]
word-finally. The activation will then spread to everything linked to those
representations; the stronger the link, the greater the transmitted activation.
(Activation will also spread from #/s/ to the auditory exemplars of all the
s-initial words that she knows, including /sak/ as well as /si/ and /su/.)
Since she has said the [s] correctly in other words and must therefore have an
output exemplar of it, activation can spread from the auditory exemplar of #/s/
to its output exemplar, which we'll call [s]. (We've switched from / / to [ ]
brackets because we are now talking about the child's output representations.)
This output exemplar for [s] is not just articulatory-motor; it includes motor,
proprioceptive, kinesthetic, and haptic (touch) memories. Activation also
spreads further: to other input and output words with s, to her map from /s/ to
[s], to her existing output representation [gak], and to her map from /sak/ to
[gak]. (Recall that exemplar theory necessarily includes whole-word input >
output maps, whether or not a speaker also has segmental maps, because anything
that we do starts to establish a linkage, and anything that we do a lot has its
own exemplar or set of them.) The child's existing output representation [gak]
also activates a constructed (feed-forward) auditory representation, /gak/,
because of all the times that thinking of [gak] has preceded actually saying it
and hearing herself say it. Finally, saying the word generates real-feedback
auditory and articulatory [gak]. (And maybe we've missed enumerating some
other activated gocks and socks and /g/s and /s/s; try drawing a diagram of all the
links we've listed and see what you would add.)
So the child has a fair amount of simultaneous activation of incompatible
alternatives ([sak] and [gak], #[s] and #[g]) in both her input and her output
representations. This means that each one sends out inhibitory signals towards
the other (this is standard neural net modeling), and these will compete with
each other. If she says the word, one of the alternatives, either [gak] or [sak],
has to win, but they are both active. Now we can see why it takes so long to
change from an old way of saying a word to a new one, why an imitation may
or may not be more like the adult model, and what is likely to be going on
before the observable shifts in imitative and then in spontaneous productions
of a word.
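The competition between two simultaneously active forms can be sketched as a pair of mutually inhibiting units. This is a minimal toy, assuming a simple linear update rule; the function name compete and the values for inhibition, rate, and margin are hypothetical choices made for illustration, not parameters of the Linked-Attractor model.

```python
def compete(strength_old, strength_new, inhibition=0.6, rate=0.2,
            margin=0.2, max_steps=200):
    """Let two output forms inhibit each other until one clearly leads.

    Returns (winner, steps); winner is None if neither form pulls ahead.
    All parameter values are illustrative assumptions.
    """
    act_old = act_new = 0.0
    for step in range(1, max_steps + 1):
        # Each form is driven by its own support and damped by its rival.
        delta_old = strength_old - act_old - inhibition * act_new
        delta_new = strength_new - act_new - inhibition * act_old
        act_old = max(0.0, act_old + rate * delta_old)
        act_new = max(0.0, act_new + rate * delta_new)
        if act_old - act_new > margin:
            return "old", step
        if act_new - act_old > margin:
            return "new", step
    return None, max_steps


if __name__ == "__main__":
    print(compete(1.0, 0.5))  # old form much stronger than the new one
    print(compete(1.0, 0.9))  # new form nearly as strong as the old one
    print(compete(1.0, 1.0))  # perfectly balanced support
```

In this sketch a clearly stronger form wins quickly, a nearly balanced pair takes longer to settle, and a perfectly balanced pair never settles, which is one way to picture both the slow changeover and the vacillation between variants.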
Let's work through some of these cases, at least sketchily; if you follow all the
logic, you'll probably find additional steps and considerations. Warning: it's
still very unclear what relative importance to assign to the representations and
input-output mapping of the whole word versus the representations and input-output
mapping of the problem segment (in our example, the initial #/s/). We
have already assumed that our child can say #/s/-initial words as long as they
don't contain any other obstruent; the account here also depends on having a
reasonably strong articulatory representation of #/s/ as a segment.
First, why does it take the child so long to let go of her old pronunciation of
the familiar word sock? Because every time she says it the old way, her
mapping from her input representation /sak/ to her old output articulatory
representation [gak] gets strengthened, and so does the output representation
[gak] itself, even though an internal negative ("you're doing it wrong")
auditory feedback signal must be generated if she hears her own initial #[g]
at a time when her adult-speech-based auditory representation with initial
#/s/ is also active. (We don't know how often or under what conditions those
negative signals occur.) When this negative internal feedback occurs, it
quietly eats away at the strength of the erroneous /sak/ to [gak] mapping
and/or the output [gak] representation.
For the child to actually start saying [sak], this nagging negative feedback
("the #[g] of [gak] is wrong") has to be coupled with information about what to
say instead of that #[g]. (Otherwise, the effect of the negative feedback would
probably be avoidance of attempts at the word, as with Jacob in the ball
story above.) We assumed that our hypothetical child has #[s] in some other
words; indeed, overall, our model predicts that the pronunciation of an established
word is very unlikely to improve until any needed sounds have been
produced in some other word. Practice in saying the correct initial #s in those
other words strengthens the mapping between her auditory representation of #s
and its articulatory representation. This will make the nagging "you're doing
it wrong" feedback from saying [gak] stronger, and also permit an unspoken
articulatory representation as [sak] to start forming, although it will be too weak
to compete successfully with [gak] for a while.
How does imitation affect pronunciation in the Linked-Attractor model?
Consider the diary observation (Menn 1971: 247) of what happened over
time as LM tried to correct Danny's pronunciation of the word tub, which
was for several months invariably produced as [bb]. At first, correcting
[bb] to /tb/ seemed unhearable: it produced no observable effect. Later
in his development, Danny could imitate /tb/ correctly when he had just
heard it, but the new form seemed forgotten immediately. Still later, the
correct form occasionally appeared in his spontaneous speech; if the old form
did appear during this period and LM corrected it, Danny imitated the correct
form with effort. Those three stages are probably fairly general. (There was
also a transition [t bb] between the no-effect stage and the transient
correction stage, in which the correct sound was tacked onto the beginning
of the word.)
Suppose that hearing a word that you want to say temporarily boosts the
strength of its auditory representation; this seems a reasonable first-pass way of
modeling how repetition helps. If you already have a weak mapping from the
auditory representation to an articulatory representation, the extra activation of
your auditory representation will flow to that articulatory representation (probably
just a little if you are not planning on repeating the word, more if you intend
to say it). That boost to the correct articulatory representation will also send
inhibition to the incorrect one, and it may tip the balance between the old form
and the new form. (Reflect on your own L2 learning experiences and see if this
makes sense.)
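The three stages in the tub diary observation can be reproduced in a deliberately crude sketch of this boosting idea. Everything here is an assumption made for illustration: the function name produced_form, the single number standing for each representation, and the particular values of baseline, boost, and inhibition are hypothetical and are not estimates from any child's data.

```python
def produced_form(old_strength, new_link, just_heard,
                  baseline=0.5, boost=1.0, inhibition=0.5):
    """Decide which form is produced: the entrenched old one or the new one.

    new_link is the strength of the mapping from the auditory representation
    to the correct articulatory representation. All values are illustrative.
    """
    # Hearing the adult model temporarily raises auditory activation.
    auditory = baseline + (boost if just_heard else 0.0)
    # Activation flows to the correct articulatory form through the link.
    new_activation = auditory * new_link
    # The correct form also sends inhibition to the entrenched old form.
    old_activation = max(0.0, old_strength - inhibition * new_activation)
    return "new" if new_activation > old_activation else "old"


if __name__ == "__main__":
    for label, link in [("weak link", 0.2), ("medium link", 0.8), ("strong link", 1.5)]:
        imitated = produced_form(1.0, link, just_heard=True)
        spontaneous = produced_form(1.0, link, just_heard=False)
        print(label, "| imitated:", imitated, "| spontaneous:", spontaneous)
```

With a weak link, correction has no visible effect; with a medium link, the new form appears only right after the model is heard; with a strong link, it appears spontaneously as well.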
D. Model summary
Rules and constraints are useful ways of dealing with well-behaved data, but
the Linked-Attractor model allows rules and constraints to be seen as part of a
larger picture that also includes unruly data and attraction to templates. More
importantly, it offers hope of providing a psychologically plausible mechanism
for how rules and constraints change over time, and how exceptions to them
may arise and then get submerged in the general pattern.
Attractors are massive bodies or subspaces in auditory-articulatory space,
the bigger (or thicker) the stronger. Highly marked forms are, conceptually,
areas that repel the formation of representations. They can be modeled as
(near-)empty peaks and ridges in a 3-D space; they are places that the
articulators find difficult to get to, and perhaps they might even be sources
of repulsive force (dark energy?) in a hyperspace. Markedness constraints
can be represented as contour lines in the 3-D space; the stronger the
markedness constraint, the higher the ground. Violating fewer constraints
puts you lower in the landscape, and violating more of them means climbing
higher.
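The landscape picture can be made concrete by treating the height of a form as the summed weight of the markedness constraints it violates, in the spirit of weighted-constraint approaches. The sketch below is an assumption-laden toy: the two constraints, their weights, the small PLACE table, and the function name height are all hypothetical, and forms are reduced to their consonants.

```python
# Hypothetical place-of-articulation table for a handful of consonants.
PLACE = {"s": "alveolar", "t": "alveolar", "d": "alveolar",
         "k": "velar", "g": "velar", "b": "labial", "p": "labial"}
FRICATIVES = {"s"}


def violates_place_harmony(consonants):
    """True if the consonants of the form do not all share one place."""
    return len({PLACE[c] for c in consonants}) > 1


def violates_no_fricative(consonants):
    """True if the form contains a fricative."""
    return any(c in FRICATIVES for c in consonants)


# (constraint, weight) pairs; the weights are illustrative assumptions.
CONSTRAINTS = [(violates_place_harmony, 2.0), (violates_no_fricative, 1.0)]


def height(consonants):
    """Height in the landscape: total weight of the violated constraints."""
    return sum(weight for test, weight in CONSTRAINTS if test(consonants))


if __name__ == "__main__":
    for form in ["sk", "st", "gk"]:  # consonant skeletons of sock-like forms
        print(form, height(form))
```

Under these assumed weights, a harmonized skeleton like g...k sits on low ground, while the adult-like s...k, which violates both constraints, sits highest.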
Faithfulness, the pull to match the adult model, is an essential part of a
developmental model, and it was missing from both the rule and the template
approaches. Optimality Theory provided the first formal approach to faithfulness;
in Optimality Theory / Harmonic Grammar faithfulness constraints
work against markedness constraints, making the child's auditory representation
of her output come as close as possible to her auditory representation
of her input. The Linked-Attractor model still needs work to make sure that it
automatically creates a privileged attraction between child-speaker and
adult-speaker auditory representations of a given word that captures what
OT faithfulness does.
The Linked-Attractor model needs many more dimensions than any three-
dimensional visualization can represent. Articulatory and acoustic features each
need perhaps eight dimensions or so (one for each feature to be represented). The
dimension of time is also essential if we are going to represent the articulatory
trajectory of a word, and we must represent that trajectory in order to capture
the phonotactic constraints that play such a strong role in child phonology. So a
full model needs a high-dimensional vector space; specifically, a representation
space composed at least of strings of articulatory and acoustic features plus
time, and that's not counting the dimensions for semantics, pragmatics, and
speaker identification. At the beginning of speech, reflecting the individual
child's prelinguistic development, this vector space would have the configuration
of attractors and repellors that model the preferences of her articulatory/auditory
system as modified by her early experiences of listening to the ambient language
and babbling.
VII. Conclusion: the whole elephant
We can learn a lot by looking at rules, constraints, and attractors separately, but
using the Linked-Attractor model, we can begin to combine them to get a more
complete picture of phonological development. It's not as neat as assembling
several views of a house, or sagittal (vertical lengthwise) and coronal (vertical
crosswise) CT-scan slices of a brain. Rather, it's like trying to combine different
kinds of brain images and diagrams (fMRI, MRI, ERP, etc.), which represent
different but overlapping kinds of information, with different degrees of temporal
and spatial resolution.
Similarly, the three theoretical approaches represent different but overlapping
kinds of information, at different levels of temporal and phonetic resolution. All
of them are just schematic compared to the real level of explanation, the level
of patterns of neural activity (a point on which we emphatically agree with
Smolensky and Legendre 2006a, b), but we're not even close to being able to
get data at that level, or to interpreting them if we had them. If you are still
unhappy with this degree of redundancy, remember that our minds are continually
building many sorts of cross-modal shortcuts that are efficient in terms
of time but redundant in terms of representation. For example, you are probably
a skilled typist, and you have direct links from frequent words to hand and
finger movement patterns, even though you theoretically don't need them; the
evidence for this is the way that an unfamiliar word or a non-word slows down
your typing.
Actual instances of the forms of words spoken to the child and by the child
are the most micro data level that we can work with to date. An input attractor
for a word is formed by the accumulation and integration of its input forms;
an output attractor for the word is formed analogously, and they are linked (first
as wholes, later as analyzed sequences of segments) by the input-output
connections that are gradually built up by each attempt to say that word. The
creation of the input and output exemplars is the first level of abstraction away
from the individual instances. Those exemplars (those interacting accumulations
of examples) affect one another in a way that is reasonably well
represented by a space-warping metaphor like the ones already used in the
infant perception literature. Strong exemplars of input and output forms form
the bases of the child's auditory and articulatory templates, and they warp the
multidimensional space around them the way massive bodies in the universe
warp space-time.
Constraints, abstracting further away from the individual exemplars, represent
the structure of that warped space in a way that is easy to talk about and
to compare across time and individuals. Rules are, roughly speaking, orthogonal
to constraints: they mostly reflect direct links between input and output
words, sound sequences, or speech sounds. We can't prove that they are a
necessary type of representation, but that is no longer the point; they are
useful, especially because one of the things that they do better than any other
type of representation is to clarify the differences between "ruly" (segmentally
organized, stable) correspondences and "unruly" (whole-word and/or
unstable) input-output correspondences.
Currently missing from the Linked-Attractor model is any treatment of
morphophonemic alternation, that is, what the speaker comes to know about
the systematic relationships between surface forms. This limitation will eventually
be serious; we need to figure out how to grow an abstract extension to
this audition-and-articulation-based conception. The earliest published version
of the two-lexicon model (Kiparsky and Menn 1977) had a box labeled "underlying
representation hypothesized by the child" above the box that we would
now call "auditory representation." Within a generative phonological theory,
the idea is that as the child learns about allomorphic relationships (from sets of
words like, e.g., native, nation, national, nativity) the underlying representations
of words would become more and more abstract, and more and more
elaborate rules would be needed to link that representation to the auditory
representation. In a language with richer and more frequently instantiated
morphophonemic variation, all speakers would eventually build representations
with a relatively high degree of abstraction.
The two-lexicon model also represented a maturing output system by grad-
ually modifying the selection rules: more of the information in the auditory
input representation (for example, the place of articulation of all of the stops in a
word instead of just one of them) is selected instead of being discarded, and that
information is preserved more accurately over time (for example, [r] is produced
with palatal approximation as well as with lip-rounding). The Linked-Attractor
model still needs to be made able to represent both of these kinds of matura-
tional changes.
Appendix
Ellie's extreme unruly mappings from adult models to output from her first 500
target-output lexical pairs (data from Menn et al. 2009; the number after the
input word is the order in which the input-output pair was attested, so that you
can see how impossible it is to account for the variability in mapping in terms of
order of production).
Cited items 26 to 469 attested at ages 2;0.16 to 3;3.17 (see p. 497).
Note
1. This story is a clear case of a child failing to make automatic across-the-board
corrections to an earlier way of rendering a sound. Editor MMV reminds us that
Vihman (1982) gives another: after her subject V. stopped uniformly producing all
adult instances of /θ/ with [f], she needed a year to sort out which adult words actually
had /θ/ and which had /f/; during that time, she produced hypercorrections like [waiθ]
for wife.
A. Varying treatments of initial consonants
sit down 26 > [s dam]
  vs. sock 60 > [dat]
B. Varying repairs to violations of consonant harmony
drink 40 > [di] (deletion)
  vs. truck 47 > [gk], yuck 48 > [gk], junk 64 > [gk] (regressive velar harmony)
  vs. duck 57 > [dt], dog 253 > [dat] (progressive alveolar harmony or velar fronting)
cup 68 > [bp] (after earlier avoidance of this word, regressive labial harmony)
C. Nonharmonic treatments of final consonants, varying within
consonant and within natural class
/t#/ hat 95 > [t]
  vs. out 211 > [a]
/s#/ juice 132 > [dus], this 184 > [ds], kiss 307 > [bis]
  vs. house 244 > [], this 245 > [d]
/ʃ#/ fish 320 > [bis], push 212 > [bs] (compare treatment of other final
  sibilants, other final fricatives)
/k#/ black 238 > [b]
  vs. milk 228 > [mot]
/v#/ move 207 > [mu], off 247 > [a] (compare treatment of other final fricatives)
/θ#/ teeth 233 > [tis] (compare treatment of other final fricatives)
/z#/ scissors 218 > [ds] or [dsz]
  vs. Cheerios 202 > [dijos]
  vs. has 215 > [], please 248 > [pi]
  vs. mittens 255 > [mnz], these 274 > [diz]
/tʃ#/ watch 192 > [wa], ouch 225 > [a], which 248 > []
  vs. ouch 213 > [as], watch 431 > [was], church 450 > [ds], couch 465 > [kas]
/ndʒ#/ orange 204 > [jnt] (compare forms for /ntʃ/, below)
/ntʃ#/ ranch 284 > [w s]
  vs. bunch 362 > [bnts], lunch 469 > [nts]
References
Au, T. K., Knightly, L. M., Jun, S.-A., and Oh, J. S. (2002). Overhearing a language
during childhood. Psychological Science, 13(3), 238–43.
Beckman, M. E. and Edwards, J. (2000). The ontogeny of phonological categories and the
primacy of lexical learning in linguistic development. Child Development, 71, 240–9.
Beckman, M. E., Munson, B., and Edwards, J. (2007). Vocabulary growth and developmental
expansion of types of phonological knowledge. In J. Cole and J. I. Hualde
(eds.), Laboratory phonology 9, pp. 241–64. Berlin: Mouton de Gruyter.
Bernhardt, B. H. and Stemberger, J. P. (1998). Handbook of phonological development.
San Diego: Academic Press.
Boyland, J. T. (2009). Usage-based models of language. In D. Eddington (ed.),
Experimental and quantitative linguistics, pp. 351–419. Munich: Lincom.
Browman, C. P. (1978). Tip of the tongue and slip of the ear: a comparative study.
Journal of the Acoustical Society of America, 64(S1), S93.
Browman, C. P. and Goldstein, L. (1992). Articulatory phonology: an overview,
Brown, R. W. and Hildum, D. C. (1956). Expectancy and the perception of syllables.
Bybee, J. (2001). Phonology and language use. Cambridge University Press.
(2006). From usage to grammar: the mind's response to repetition. Language, 82, 711–33.
(2010). Language, usage and cognition. Cambridge University Press.
Row.
Coleman, J. and Pierrehumbert, J. (1997). Stochastic phonological grammars and accept-
ability. 3rd Meeting of the ACL Special Interest Group in Computational
Phonology: Proceedings of the Workshop, 12 July 1997. Association for
Computational Linguistics, Somerset, NJ, pp. 49–56.
Cutler, A. (2012). Native listening: language experience and the recognition of spoken
words. Cambridge, MA: MIT Press.
Davidson, L., Jusczyk, P., and Smolensky, P. (2006). Optimality in language acquisition I: the
initial and final states of the phonological grammar. In P. Smolensky and G. Legendre
(eds.), The harmonic mind, vol. 2, pp. 231–78. Cambridge, MA: MIT Press.
Edwards, J., Beckman, M. E., and Munson, B. (2004). The interaction between vocabulary
size and phonotactic probability effects on children's production accuracy
and fluency in nonword repetition. Journal of Speech, Language, and Hearing
Research, 47, 421–36.
Ferguson, C. A. and Farwell, C. B. (1975). Words and sounds in early language acquisition.
Language, 51, 419–439. Reprinted in William S-Y. Wang (ed.), The lexicon in
phonological change. The Hague: Mouton, 1977. Reprinted in this volume as
Chapter 4.
Fikkert, P. and Levelt, C. (2006). How does Place fall into place? The lexicon and emergent
constraints in children's developing phonological grammar. Unpublished MS.
Gerken, L. A. (2009). Acquiring linguistic structure. In E. Hoff and M. Shatz (eds.),
Handbook of language development, pp. 173–90. New York: Blackwell.
Grunwell, P. (1981). The nature of phonological disability in children. London:
Academic Press.
Halle, M. (1971). The sound pattern of Russian: a linguistic and acoustical investigation.
Berlin: Walter de Gruyter.
Hebb, D. O. (1949). The organization of behavior. New York: John Wiley & Sons.
Hume, E. (2008). Markedness and the language user. Phonological Studies, 11, 83–98.
4964.
Johnson, K. (1997). Speech perception without speaker normalization. In K. Johnson
and J. W. Mullenix (eds.), Talker variability in speech processing, pp. 145–65. San
Diego: Academic Press.
(2006). Resonance in an exemplar-based lexicon: the emergence of social identity and
phonology. Journal of Phonetics, 34, 485–99.
Jusczyk, P. (1993). From general to language-specific capacities: the WRAPSA model of
how speech perception develops. Journal of Phonetics, 21, 3–28.
(1997). The discovery of spoken language. Cambridge, MA: MIT Press.
Kager, R. (1999). Optimality Theory. Cambridge University Press.
Kager, R., Pater, J., and Zonneveld, W. (2004). Constraints in phonological acquisition.
Kaisse, E. and Shaw, P. A. (1985). On the theory of lexical phonology. Phonology
Yearbook I, 1–30.
Kiparsky, P. and Menn, L. (1977). On the acquisition of phonology. In John Macnamara (ed.),
Language learning and thought, pp. 47–78. New York: Academic Press. Reprinted in
G. Ioup and S. H. Weinberger (eds.), Interlanguage phonology: the acquisition of a
second language sound system, pp. 23–52. Cambridge, MA: Newbury House, 1987.
Knightly, L. M., Jun, S.-A., Oh, J. S., and Au, T. K.-f. (2003). Production benefits of
childhood overhearing. Journal of the Acoustical Society of America, 114(1),
465–74.
Kuhl, P. (1991). Human adults and human infants show a perceptual magnet effect for
the prototypes of speech categories, monkeys do not. Perception and Psychophysics,
20(2), 93–107.
(2007). Is speech learning gated by the social brain? Developmental Science, 10,
110–20.
Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., and Lindblom, B. (1992).
Linguistic experience alters phonetic perception in infants by 6 months of age.
Science, 255, 606–8.
Labov, W. (1981). Resolving the neogrammarian controversy. Language, 57(2), 267–308.
Labov, W. and Labov, T. (1978). The phonetics of cat and mama. Language, 54(4),
816–52.
Leopold, W. F. (1939–49). Speech development of a bilingual child (4 vols.). Evanston,
IL: Northwestern University Press.
Legendre, G., Hagstrom, P., Vainikka, A., and Todorovna, M. (2006). Optimality in
language acquisition II: inflection in early French syntax. In P. Smolensky and
G. Legendre (eds.), The harmonic mind, vol. 2, pp. 276–306. Cambridge, MA: MIT
Press.
Lindblom, B., Diehl, R., Park, S.-H., and Salvi, G. (2011). Sound systems are shaped by
their users. In G. N. Clements and R. Ridouane (eds.), Where do phonological
features come from?, pp. 67–97. Amsterdam: John Benjamins.
(1979). Developmental reorganization of phonology: a hierarchy of basic units of
acquisition. Lingua, 49, 11–49. Reprinted in this volume as Chapter 5.
Maye, J., Werker, J., and Gerken, L. (2002). Infant sensitivity to distributional information
can affect phonetic discrimination. Cognition, 82, 101–11.
McCune, L. (This volume). A view from developmental psychology.
McMurray, B., Cole, J., and Munson, C. (2011). Features as an emergent product of
perceptual parsing: evidence from V-to-V coarticulation. In G. N. Clements and
R. Ridouane (eds.), Where do phonological features come from? Cognitive,
physical and developmental bases of distinctive speech categories, pp. 197–236.
Amsterdam: John Benjamins.
Ménard, L., Schwartz, J.-L., Boë, J.-L., and Aubin, J. (2007). Articulatory–acoustic
relationships during vocal tract growth for French vowels: analysis of real data and
simulations with an articulatory model. Journal of Phonetics, 35, 1–19.
(1976). Pattern, control, and contrast in beginning speech: a case study in the develop-
ment of word form and word function. PhD dissertation, University of Illinois,
Urbana; published, Bloomington: Indiana University Linguistic Club, 1979.
(1983). Development of articulatory, phonetic and phonological capabilities. In
Menn, L. and Matthei, E. (1992). The two-lexicon model of child phonology: Looking
back, looking ahead. In C. A. Ferguson, L. Menn, and C. Stoel-Gammon (eds.),
Phonological development: models, research, implications, pp. 211–47. Timonium,
MD: York Press.
Menn, L., Schmidt, E., and Nicholas, B. (2009). Conspiracy and sabotage in the
acquisition of phonology: dense data undermine existing theories, provide
scaffolding for a new one. Language Sciences, 31 (26), 285–304.
Menn, L. and Vihman, M. M. (2011). Features in child phonology: inherent, emergent,
or artefacts of analysis? In G. N. Clements and R. Ridouane (eds.), Where do
phonological features come from?, pp. 261–301. Amsterdam: John Benjamins.
Messum, P. (2005). Learning to talk: a non-imitative account of the replication of
phonetics by child learners. CamLing 2005, 99–106.
Moskowitz, B. A. (1970). The two-year-old stage in the acquisition of phonology.
Munson, B., Edwards, J., and Beckman, M. (2012). Phonological representations in
language acquisition: climbing the ladder of abstraction. In A. C. Cohn,
C. Fougeron, and M. K. Huffman (eds.), The Oxford handbook of laboratory
phonology, pp. 288–309. Oxford University Press.
Oh, J. S., Jun, S.-A., Knightly, L. M., and Au, T. K. (2003). Holding on to childhood
language memory. Cognition, 86(3), B53–B64.
Oliveira-Guimarães, D. (This volume). Beyond early words: word template development
in Brazilian Portuguese.
Ota, M. (2006). Input frequency and word truncation in child Japanese: structural and
lexical effects. Language and Speech, 49, 261–94.
(This volume). Lexical frequency effects on phonological development: the case of
word production in Japanese.
Ota, M. and Green, S. J. (2013). Input frequency and lexical variability in phonological
development: a survival analysis of word-initial cluster production. Journal of
Peperkamp, S. (2003). Phonological acquisition: recent attainments and new challenges.
Peters, A. M. and Menn, L. (1993). False starts and filler syllables: ways to learn
grammatical morphemes. Language, 69(4), 742–77.
Pierrehumbert, J. (2002). Word-specific phonetics. In C. Gussenhoven and N. Warner
(eds.), Laboratory phonology 7, pp. 101–39. Berlin: Mouton de Gruyter.
(2003). Phonetic diversity, statistical learning and acquisition of phonology. Language
and Speech, 46, 115–54.
Priestly, T. M. S. (1977). One idiosyncratic strategy in the acquisition of phonology.
Pye, C., Ingram, D., and List, H. (1987). A comparison of initial consonant acquisition in
English and Quiché. In K. E. Nelson and A. Van Kleeck (eds.), Children's language,
vol. 6, pp. 175–90. Hillsdale, NJ: Lawrence Erlbaum.
Rice, K. (2007). Markedness in phonology. In Paul de Lacy (ed.), Cambridge handbook
of phonology, pp. 79–98. Cambridge University Press.
Roy, D. (2011). www.ted.com/talks/deb_roy_the_birth_of_a_word.html.
Saffran, J. R., Aslin, R. N., and Newport, E. L. (1996). Statistical learning by 8-month-
old infants. Science, 274, 268.
Saffran, J. R. and Thiessen, E. D. (2003). Pattern induction by infant language learners.
Press.
Smolensky, P. and Legendre, G. (2006a). Harmony optimization and the computational
architecture of the mind/brain. In P. Smolensky and G. Legendre (eds.), The
harmonic mind, vol. 1, pp. 362. Cambridge, MA: MIT Press.
(2006b). Principles of the integrated connectionist-symbolic cognitive architecture. In
P. Smolensky and G. Legendre (eds.), The harmonic mind, vol.1, pp. 63344.
Cambridge, MA: MIT Press.
Szreder, M. (This volume). The acquisition of consonant clusters in Polish: a case study.
Stoel-Gammon, C. (1989). Prespeech and early speech development of two late talkers.
(2011). Relationships between lexical and phonological development in young
children. Journal of Child Language, 38, 1–34.
Stoel-Gammon, C. and Dunn, C. (1985). Normal and disordered phonology in children.
Austin, TX: Pro-Ed.
Storkel, H. (2001). Learning new words: phonotactic probability in language
development. Journal of Speech, Language and Hearing Research, 44(6), 1321–37.
Swinney, D. (1979). Lexical access during sentence comprehension: (re)consideration of
context effects. Journal of Verbal Learning and Verbal Behavior, 18, 645–59.
Tokowicz, N. and MacWhinney, B. (2005). Implicit and explicit measures of sensitivity
to violations in second language grammar: an event-related potential investigation.
Studies in Second Language Acquisition, 27, 173–204.
Velleman, S. L. and Vihman, M. M. (2006). Phonological development in infancy and
early childhood: implications for theories of language learning. In M. C. Pennington
(ed.), Phonology in context, pp. 25–50. Luton: Macmillan.
Vihman, M. M. (1981). Phonology and the development of the lexicon: evidence from
children's errors. Journal of Child Language, 8, 239–64.
(1982). A note on children's lexical representations. Journal of Child Language, 9,
249–53.
(1996). Phonological development. Cambridge, MA: Blackwell.
Chapter 2.
Vihman, M. M., DePaolis, R. A., and Keren-Portnoy, T. (2009). A Dynamic Systems
approach to babbling and words. In E. Bavin (ed.), Handbook of child language,
Vihman, M. M., Kay, E., Boysson-Bardies, B., Durand, C., and Sundberg, U. (1994).
External sources of individual differences? A cross-linguistic analysis of the
phonetics of mothers' speech to one-year-old children. Developmental
Psychology, 30(5), 652–63.
ogy? Towards an integration of linguistic and psychological approaches, In
Singular Publishing. Reprinted in this volume as Chapter 9.
(2000). The construction of a first phonology. Phonetica, 57, 255–66.
Wauquier, S. and Yamaguchi, N. (This volume). Templates in French.
(1976). Perception and production in the acquisition of phonology. In W. von
Raffler-Engel and Y. Lebrun (eds.), Baby talk and infant speech, pp. 294–322. Amsterdam:
Swets & Zeitlinger. Reprinted in N. Waterson. Prosodic phonology: the theory and
its application to language acquisition and speech processing, pp. 678. Newcastle
upon Tyne: Grevatt & Grevatt, 1987.
Zamuner, T. S., Gerken, L., and Hammond, M. (2004). Phonotactic probabilities in
young children's speech production. Journal of Child Language, 31, 515–36.
References for reprinted papers
ition. Language, 51, 419–39. Reprinted in W. S-Y. Wang, The lexicon in
phonological change. The Hague: Mouton, 1977.
units of acquisition. Lingua, 49, 11–49.
Menn, L. (1983). Development of articulatory, phonetic, and phonological capabilities.
In B. Butterworth (ed.), Language production, vol. 2, pp. 3–50. London: Academic
Press.
templatic phonology. Linguistics, 45, 683–725.
ogy? Towards an integration of linguistic and psychological approaches. In
Singular Publishing.
179–211.
Index
Abdoh, 380, 404
Abondolo, 45, 51
abstract, 7, 17, 22, 414, 468, 88, 156, 169,
266, 280, 294, 325, 343, 446, 375, 446, 451,
462, 46970, 475, 496
abstraction, 478–9, 495
accent, 320, 332
accentual arc, 325, 331, 333, 338
accentual pattern, 36, 338
accuracy, 2, 8, 24, 96, 1356, 143, 2001,
2089, 262, 269, 271, 2834, 291, 297, 299,
344, 3524, 357, 366, 380, 4035, 407, 417,
421, 424, 426, 4301, 442, 465
accurate, 3, 8, 22, 24, 26, 28, 96, 103, 1068,
158, 160, 174, 195, 197, 203, 244, 269, 280,
291, 296, 303, 312, 374, 376, 380, 384, 395,
398, 399, 4057, 416, 426, 428, 452, 4878
acoustic analysis, 238, 241, 256, 294, 296
acoustic signal, 6
across-the-board, 8, 239, 415, 431, 482, 496
activation, 484, 491–2
adapt, 289, 405, 488
adaptation/s, 305, 329, 334, 3745, 3956,
3989, 403, 406
adapted, 23, 5, 23, 26, 29, 32, 36, 41, 244,
2967, 303, 305, 307, 327, 339, 351, 365,
376, 391, 3956, 3989, 404, 406, 442
adapting, 25
Adda, 319, 339
Adda-Decker, 319, 339
adult phonological representation, 312
adult phonology, 17, 21, 412, 44, 106, 110,
137, 140, 144, 207, 2678, 378, 381, 405,
4612, 482
affricate/s, 353, 356
affrication, 6578, 812, 352, 355
Akan, 43
Alarcos Llorach, 137, 165
Alcantara, 12
Ali, 380
Allen, 322, 339, 362, 36970, 418, 426, 432
allomorphy, 205–6
allophonic rules, 205, 207
Almeida, 310, 313
Al-Tamimi, viii, ix, 3, 10, 344, 376, 378, 408–9
alternation, 480
alveolar closure, 114
alveolarity, 6578
Amayreh, 380, 405, 408
ambient language/s, 1, 4, 235, 35, 401,
2602, 313, 31920, 375, 41617, 44950,
4534, 4656, 473, 4823, 487, 494
Ammar, 380, 381, 404, 408
analogy, 2, 143–4
anchor syllable, 394
Anderson, 7, 11, 290, 456
Aoyama, 364, 370
Arabic, 3, 10, 3445, 37685, 390, 3945, 398,
403, 4057, 483
articulation, 3, 65, 72, 7783, 87, 89, 112, 138,
153, 1701, 178, 183, 186, 1956, 2089,
227, 252, 26970, 2801, 300, 313, 3456,
3534, 35660, 387, 460, 467, 476, 496
articulators, 195
articulatory challenge, 301, 311
articulatory control, 168, 260, 405, 407,
468, 473
articulatory difficulty, 487
articulatory effort, 224
articulatory factors, 227
articulatory gestures, 46, 49, 186, 277, 343,
452, 470
articulatory habits, 204
articulatory motor programming, 169
articulatory patterns, 267, 360
Articulatory Phonology, 49, 3435, 479
articulatory program, 197–8
articulatory representation, 492–3
articulatory routine, 242
articulatory score, 49
articulatory skills, 346
Aslin, 6, 11, 13, 260, 286, 475, 501
assimilation/s, 19, 22, 80, 84, 86, 978, 112,
138, 145, 155, 157, 175, 1802, 189, 1912,
197, 207, 304, 352, 3546, 362, 366, 368,
412, 415, 462, 481, 491
attention, 23, 38, 196, 203, 245, 260, 271, 281,
284, 345, 357, 360, 363, 3745, 406
attraction, 494
attractor/s, 38, 472, 4747, 481, 4846, 489,
491, 4945
Au, 471, 498500
Aubin, 500
auditory coding, 6
auditory input representation, 496
auditory memory, 201, 476
auditory representation, 461, 488, 4924, 496
autosegmental, 8, 277, 285, 337
Autosegmental and Metrical Phonology, 8,
337, 462
avoidance, 108, 11011, 169, 178, 209, 224,
329, 334, 488, 493, 497
awareness, 169, 206, 238, 242, 245
babble/babbling, 2, 45, 24, 32, 40, 47, 95, 139,
16972, 204, 240, 259, 2612, 267, 270,
2778, 2835, 291, 310, 318, 320, 326, 374,
382, 384, 417, 441, 444, 452, 4767, 482,
494
baby talk, 163, 199, 381
backness, 6578, 86
Bailey, 48, 51
Baillargeon, 263, 286
Barton, 1756, 182, 2012, 210, 212, 239, 320,
339
basic unit, 47, 49, 133, 140, 161, 232,
338, 343
Bassano, 319, 335, 339
Bates, 240, 254, 262, 264, 286, 364, 370,
444, 456
Bauer, 24, 53, 266, 287
Beckman, 67, 1112, 22, 47, 512, 55, 417,
4323, 461, 465, 470, 498, 500
Beebe, 213
BEFs, see bisyllabic experimental forms
Bell, 181, 210
Bellugi, 61, 89, 91
Bengali, 3940
Benigni, 240, 254, 262, 286, 444, 456
Bennett, 213
Berg, 23, 51, 266, 284, 286, 415, 433
Berko, 114, 199, 206, 210, 212, 218, 233
Berko Gleason, 114, 199, 206
Berman, 25, 51
Bernhardt, 362, 370, 463, 483
Bertoncini, 12
Bhaya Nair, 39, 51, 376, 405, 408
bias, 176, 187, 202, 2689, 282
Bijeljac-Babic, 11
bilabial closure, 114
bilabiality, 6578
bilingual, 319, 334
binarity, 48
binary, 48, 3212, 324, 335, 338
binary foot, 3213, 329, 334, 338, 380
Binkofski, 456
Bisol, 293, 313
bisyllabic experimental forms (BEFs), 3,
21732, 237
bisyllabic ordinary forms (BOFs), 2, 217,
2245, 234
Blasdell, 231, 233
Bleile, 2389, 254
Bloom, 135, 165, 265, 286, 4434, 456
Bloomfield, 103, 114
Boë, 500
Boersma, 8, 11, 382, 408
BOFs, see bisyllabic ordinary forms
boldness, 284
Bond, 201, 211
Bonilha, 313
Bosch, 5, 12
Boudelaa, 377, 408
Boula de Mareüil, 339
Bowen, 137, 166
Bower, 162, 165
Bowerman, 24, 51, 183, 239, 2534
Bowman, 55
Boyland, 460, 498
Boysson-Bardies, 5, 11, 24, 51, 56, 261, 2856,
320, 340, 417, 4325, 502
Braine, 2534
Branigan, 171, 210, 283, 286
Braud, 321, 326, 332, 3379
Brazilian Portuguese, 3, 2915, 298,
31011; see also Portuguese
Bretherton, 240, 254, 262, 286, 456
Broselow, 378, 403, 408
Browman, 7, 11, 49, 51, 343, 360, 472,
479, 498
Brown, 45, 11, 61, 879, 91, 472, 498
Brulard, 41, 52, 319, 334, 339
Bruskin, 264, 288
Buccino, 451, 456, 458
Buchtal, 454, 456
Buckley, 322, 324, 340
Bush, 94, 115, 134, 165, 241, 254
Butterworth, 210
Bybee, 18, 44, 479, 52, 31213, 460,
466, 498
Calvo, 52
Camaioni, 240, 254, 262, 286, 444, 456
Camarda, 458
Campos, 263, 289, 448, 458
canonical form/s, 105, 13940, 169, 192,
194200, 208, 463, 476, 481, 488
Cappa, 458
Carr, 41, 52, 319, 334, 339
Carroll, 61, 91
Carubbi, 52
Carvalho, 319, 339
categorization, 7
category, 67, 426, 4950, 88, 354, 376
centrality, 6578
challenge, 23, 301, 310, 333, 346, 367, 408
Chambers, 340
Champion, 416, 433
Charette, 324, 339
Charles-Luce, 22, 56
Chen, C.-C., 416, 435
Chen, M., 114
Chiat, 23, 52, 266, 286
Childers, 363, 370
Chin, 362, 370
Choi, 24, 51
Chomsky, 7, 1011, 901, 113, 115, 343, 360,
4456, 456, 462, 464, 474, 498
Chomsky, 10
Church, 340
Clark, E. V., 2534
Clark, H. H., 17, 52
Clements, 7, 11
closed syllable, 297
closeness, 67
Clumeck, 172, 210
cluster insertion, 351–2
clusters, 10, 44, 108, 148, 154, 157, 159, 179,
181, 193, 196, 199200, 207, 2312, 268,
294, 318, 334, 34452, 354, 3567, 35960,
362, 365, 369, 376, 381, 387, 408, 412,
41617, 431, 463, 467, 473, 479, 487,
490; see also consonant clusters
coda/s, 2934, 298301, 310, 319, 334, 3778,
3801, 390, 3945, 4045, 407, 417
code-switching, 384
cognitive development, 4, 87, 194, 204
cognitive load, 202
cognitive style, 164–5
Cohen, 61, 91
coincidences, 219, 228
Cole, 460
Coleman, 8, 11, 465, 472, 498
combination, 307, 455
communicative grunts, 453–5
competence, 1819, 80, 889, 113
competition, 34, 74, 812, 151, 155, 299,
31012, 480, 4912
complex realizations, 387
complex targets, 387
confusion, 148, 151, 156
connectionist model, 284
conscious, 447–8
consciousness, 9, 202, 263, 446, 455
consolidation, 136, 189, 192, 312
consonant cluster, 40, 139, 144, 148, 158, 172,
199, 354, 357, 431, 481; see also clusters
consonant sequences, 367
consonant-final words, 245, 248, 256
conspiracy, 181–2, 184
constraint/s, 45, 78, 10, 24, 26, 289, 32,
356, 434, 48, 112, 13840, 153, 158,
1612, 1815, 193, 195, 239, 260, 267, 275,
31720, 3256, 330, 3339, 3435, 3567,
367, 406, 426, 4607, 4705, 4803, 4935
context-dependent words, 453
context-limited words, 2645, 285, 448
continuance, 6578, 813, 857
continuity, 7, 18, 31718, 326, 444, 4534
contrast, 21, 412, 99, 103, 1056, 108,
11013, 133, 13940, 143, 146, 148, 153,
1559, 163, 168, 176, 1823, 192, 208, 232,
259, 2716, 283, 2856, 311, 364, 431
co-occurrence, 13940, 144, 1589, 1623,
194, 367
Cooper, 24, 55, 238, 255, 291, 314, 362, 372
Corbetta, 452, 458
Corcoran, 57
core syllable, 365, 376, 380
coronal, 352–6
correspondence, 98, 138, 140, 153, 208, 2202,
227, 229, 265, 426, 495
corresponding phones, 98
counter-stress, 3245, 332, 336, 375
Craighero, 448, 458
creativity, 144
Cristófaro-Silva, 292, 313
Croft, vii, ix, 1, 810, 13, 1718, 41, 4950, 52,
291, 315, 317, 342, 345, 361, 3745, 410,
4412, 459, 461, 467, 4702, 475, 481, 489,
501, 503
cross-language variation, 490
cross-linguistic, 5, 8, 23, 318, 374, 377, 442, 460
cross-modal, 477, 495
cross-modal mappings, 485
Crowhurst, 363, 370
Cruttenden, 105, 115, 162, 165
Cutler, 341, 363, 370, 478
CV phonology, 7
CV syllable, 310, 3289, 334, 338, 364, 404
CVCV phonology, 8
D'Odorico, 20, 52
Dahalo, 42
Dalbour, 137, 165
Dale, 364, 370, 4323
Danna, 458
Davidson, 466, 472, 498
Davis, 24, 32, 52, 56, 260, 270, 286, 288,
294, 314
Dean, 362, 372
deautosegmentalization, 268
Declarative Phonology, 8
Del Giudice, 448, 456
Delery, 453, 457
deletion, 22, 97, 114, 1389, 147, 1512, 155,
1579, 17985, 307, 330, 347, 362, 376, 381,
4045, 462, 497
Dell, 322, 324, 339
Demolin, 294, 313
Demuth, 322, 324, 335, 33940, 365, 370, 376,
409, 418, 433
den Os, 363, 372, 435
dental, 353
DePaolis, 6, 10, 1213, 20, 24, 52, 56, 342, 344,
361, 416, 433, 452, 4567
Dependency Phonology, 8
devoicing, 230
Di Cristo, 322, 3245, 331, 333, 338, 340
di Pellegrino, 448, 456
diary, 34, 1920, 401, 168, 294, 364, 4489,
480, 493
Diehl, 499
differentiation, 878, 168, 449
dimensions, 4856, 494
diminutive/s, 32, 332, 349
Dinnsen, 362, 370
discontinuity, 1819, 24
dissimilation, 80
distinctive features, 93, 111
distributional learning, 2
domain edge, 324
dominance, 3, 244, 248, 279
Donahue, 20, 52, 4556
dorsal, 354, 356
Dos Santos, 322, 340
Drachman, 207, 210, 224, 233
Druss, 340
Dunn, 501
Durand, C., 5, 11, 51, 56, 286, 340, 432, 435,
502
Durand, J., 8, 11
Dutch, 20, 319, 322, 334, 365, 417, 426, 483
dynamic systems, 441, 445, 451, 453, 455
Dyson, 380, 405, 408
ease of articulation, 169, 180, 185, 224
Echols, 320, 340, 363, 370, 418, 426, 433
Edwards, J., 67, 1112, 22, 478, 512, 417,
4323, 461, 465, 470, 497, 498, 5001
Edwards, M. L., 967, 115, 165, 218, 226, 233,
241, 254, 362, 370, 416, 435
effort, 80, 224
Egyptian, 380, 381
Eimas, 1, 11
Elbers, 20, 24, 52, 279, 286, 319, 340
Elbert, 13, 240, 255, 369, 372, 384, 410
elicitation, 218
elicited forms, 217
elision, 80
Elsen, 20, 26–7, 52
embodied cognition, 447
emergence of the unmarked, 311
emergent, 2, 5, 40, 47, 239, 25960, 268, 271,
273, 276, 27980, 282, 285, 320
emergentist, 1, 460
English, 3, 10, 201, 25, 2930, 32, 3544,
845, 92, 945, 1345, 139, 1589, 168,
1767, 1823, 1878, 190, 199, 2067, 218,
224, 226, 231, 238, 31823, 334, 355, 3626,
369, 3756, 3815, 3945, 3989, 4034,
406, 41719, 426, 454, 464, 467, 478, 4823,
486, 488, 490
Enright, 262, 289
entrenched, 312, 4735, 484
entrenched forms, 306
entrenchment, 32, 35, 312, 475, 484
epenthesis, 22
Ervin-Tripp, 231, 233
Esling, 454, 456
Estonian, 3, 20, 256, 2933, 356, 40, 55, 184,
31819, 334, 375, 377, 379, 483
Ewen, 434, 489, 52
exceptions, 32, 1578, 18994, 231, 281,
4234, 4634, 467, 474, 494
exemplar/s, 2, 69, 17, 41, 47, 50, 465, 46970,
487, 4912, 495
exemplar models, 490
experimental forms, 223, 228, 231; see also BEFs
experimentation, 152, 189, 2389, 242, 244,
24851, 271, 281, 390
extra-systemic, 106, 259, 269, 271, 274, 281, 284
Faaborg-Anderson, 454, 456
Fadiga, 447, 456, 458
Fagen, 262, 289
faithfulness, 494
faithfulness constraints, 473, 494
familiarity, 21, 29, 40, 202, 224, 454
Farwell, ix, 1, 7, 1011, 212, 24, 28, 47, 52,
93, 140, 162, 1645, 1778, 21011,
21718, 2245, 233, 238, 254, 259, 262, 266,
286, 291, 31113, 317, 340, 343, 360, 362,
372, 375, 409, 41516, 433, 4423, 450, 456,
468, 4845, 498, 503
favorite sounds, 178
Fazio, 458
featural, 7
feature geometry, 271, 276
Fee, 267, 286, 322, 339, 365, 370, 376, 409
feedback, 202, 4923
feet, 48, 97, 119, 126, 130, 3224, 338,
381; see also foot
Fennell, 57
Fenson, 364, 370
Ferguson, ix, 1, 78, 1011, 13, 212, 24, 28,
47, 52, 93, 104, 10810, 113, 115, 140, 159,
162, 1645, 172, 1778, 183, 211, 214,
21718, 2245, 233, 23840, 2545,
25960, 262, 266, 286, 288, 291, 31113,
317, 340, 343, 356, 360, 362, 369, 372, 375,
384, 40910, 41516, 433, 4423, 450, 452,
456, 468, 4845, 498, 500, 503
Fernald, 260, 287
Fey, 183, 211, 2534
Fikkert, 8, 11, 2668, 287, 319, 3212, 333,
340, 343, 360, 363, 372, 376, 409, 418, 426,
433, 476, 498
filler, 135, 333
filler syllable/s, 404
filter, 40, 239, 260, 334
final consonant/s, 44–5, 143, 157, 179, 240–1,
245, 248, 250, 252
final segments, 253
Finnish, 10, 20, 25, 36, 38, 40, 318, 3445, 357,
3639, 376, 377, 405, 4667, 483
first word, 45
Firth, 49, 52
Firthian, 7
Fisher, C., 320, 340
Fisher, H., 226, 234
Flege, 182, 211
Fodor, 91
Fogassi, 447, 456, 458
Folger, 178, 211, 284, 287
Fonagy, 322, 3245, 340
foot, 246, 258, 303, 315, 3214, 328, 333, 335,
338, 381, 424, 437, 451, 485; see also feet
forcefully articulated, 88
fossilized forms, 305
Fougeron, 324, 341
Francescato, 19, 53, 98, 115, 317, 340
Fraser, 91
Frédonie, 12
Freitas, G. C. M., 313
Freitas, M. J., 340
French, 4, 20, 25, 36, 38, 401, 182, 283, 293,
31739, 364, 375, 378, 3816, 3945, 397,
399, 4034, 4067, 417, 463, 466, 483
French, A., 36, 53
frequency, 47, 31012, 376, 378, 390, 407,
41617, 421, 430, 460, 478, 4867
frequency effects, 466
frequency of occurrence, 21, 377, 398, 465, 487
frication, 148
fricative/s, 85, 353, 356
friction, 6578, 812, 856
Frisch, 48, 53
Fromkin, 152, 166
fronting, 190
frontness, 6578, 856
frozen forms, 153, 158, 163
Fry, 61, 63, 87, 91
functionalist, 9
functionality, 31920
Gallese, 447, 456, 458
Gamkrelidze, 287
Gandour, 183, 211, 2534
Garlock, 4323
Garnica, 967, 109, 115, 162, 165, 218, 233,
260, 287
geminate/s, 10, 36, 38, 40, 344, 3634, 3669,
376, 378, 381, 385, 387, 389, 3959, 4047,
419, 483, 487
geminate template, 405
gemination, 375
generalization, 312
generative, 8
Generative Phonology, 7, 50, 62, 462, 464,
466, 474
generative rules, 462
Gentilucci, 458
Gerken, 6, 11, 320, 322, 340, 363, 372, 417,
435, 465, 475, 483, 498, 499, 502
German, 20, 21, 26, 27, 40, 84, 85, 95, 182, 206,
323, 334, 483
Gierut, 416, 4324
Gimson, 91
Gleason see Berko Gleason
glide/s, 353, 354, 356
gliding, 355
glottality, 6578
Gnanadesikan, 311, 313, 343, 360
Goad, 285, 287, 322, 324, 340
Goffman, E., 454, 456
Goffman, L., 6, 11
Gogate, 453, 457
Goldfield, 260, 287
Goldinger, 47, 53
Goldsmith, 78, 11, 49, 53, 268, 287
Goldstein, L., 7, 11, 49, 51, 211, 343, 360, 479,
498
Goldstein, U., 186
Goodell, 23, 55, 343, 357, 361
Goodman, 4323
Government Phonology, 8
Grammont, 217, 233
Greek, 20, 207
Green, 324, 417, 431, 434, 468, 500
Greenberg, 110, 162, 166
Greenlee, 2401, 255
Grégoire, 61, 91
Grijzenhout, 340
Grunwell, 22, 53, 238, 254, 355, 360, 362, 365,
372, 483, 498
Guarani, 280
Gussman, 380, 409
Hagstrom, 499
Hahn, 48, 51
Halle, 7, 1011, 612, 91, 343, 360, 462, 464,
474, 498
Hallé, 5–6, 11, 13, 51, 261, 286, 320,
340, 342
Halliday, 171, 172, 211
Hammond, 417, 435, 465, 502
Hamp, 199, 211
Harmonic Grammar, 462, 473, 494
harmonized, 334
harmonizing, 29
harmony, 20, 22, 278, 303, 36, 389, 43,
1389, 147, 152, 161, 1801, 191, 193, 195,
197, 266, 267, 271, 274, 276, 283, 3045,
308, 318, 329, 348, 353, 359, 376, 394, 395,
4067, 412, 497
Harris, J., 380, 409
Harris, Z. S., 195, 214
Haselkorn, 172, 212
Haspelmath, 50, 53
Hawkins, 174, 207, 211, 322, 339, 362, 36970,
418, 426, 432
Hayes, 322, 340, 378, 380, 409
Hebb, 470, 498
Hebrew, 20, 25
Hegazi, 380, 410
Herold, 11
hiatus, 230, 232
Hickok, 447, 450, 457
high frequency, 47, 156, 252, 274, 294, 319,
334, 378, 387, 395, 41617
Hijazi, 380
Hildum, 472, 498
Hindi, 20, 3840, 345, 3767
Hirsh-Pasek, 320, 3401
Hoaglin, 249, 255
Hochberg, 240, 255
Hodson, 417, 433
Hogan, 363, 370, 415, 4323
Höhle, 5, 11
holistic, 6, 24, 137, 207, 259, 292, 3078, 311,
320
Holmes, 418, 433
homonyms, 61, 678, 75, 95, 224, 227, 467
Hood, 165
Hooper, 164, 166
Hooper-Bybee, see Bybee
household word, 219
Howell, 362, 372
Hsieh, 104, 115, 224, 233, 416, 433
Hulst, 523
Hume, 466, 485, 499
Hunt, 263, 289
Iakimova, 12
iambic, 38, 175, 3204, 333, 3378, 378, 3801
iambic foot, 322, 324, 3289, 333, 335,
338; see also foot/feet
idiomatic, 219, 225, 228, 473, 484
idiosyncratic, 2, 111, 113, 1767, 185, 189, 217,
220, 2246, 2389, 259, 268, 312, 3267,
329, 337, 441
imitation/s, 25, 63, 78, 96, 11422, 1356, 144,
1501, 1645, 2012, 204, 218, 245, 270,
283, 285, 363, 391, 3956, 399, 471, 476,
4923
implicit, 2, 5, 24, 401, 47, 232
independent system, 10, 612
individual, 477, 479
individual differences, 4, 8, 71, 10910, 133,
1601, 1634, 168, 260, 336, 374, 390, 403,
4067, 448, 482
induction, 40
information-processing model, 163
Ingram, D., 105, 108, 115, 138, 143, 162, 166,
174, 180, 182, 184, 187, 194, 211, 217, 227,
233, 238, 254, 362, 365, 370, 372, 443, 457,
4623, 483, 499, 501
Ingram, T. T. S., 61, 91
Inhelder, 451, 458
inhibition, 492
initiation-of-speech phenomenon, 135
innate, 8, 19, 40, 263, 268, 485
innatist, 263
input attractor/s, 491, 495
input frequency, 32, 31011, 364, 41618, 421,
426, 4302, 467, 478, 4889
input lexicon, 197–9
input representation/s, 461, 471, 479, 4889,
492
input templates, 472, 4745
input–output mapping, 475, 477, 492
input–output relations, 463, 473
instability, 480
inter-modal representations, 484
internal negative feedback, 492–3
internal representation, 263, 265, 271, 2845
Irwin, 94, 115
Isako, 42
Italian, 20, 318
item, 177, 219, 407
item-based phonology, 259
item-learning, 405
Itkonen, 211
Iverson, 268, 287
Jaeger, 32, 35, 53, 2067, 211
Jaffe, 213
Jakobson, 4, 12, 21, 24, 53, 612, 79, 89, 91, 98,
104, 106, 10810, 11415, 162, 166, 170,
178, 187, 193, 211, 224, 232, 234, 362, 370,
441, 457
Japanese, 20, 364, 41720, 424, 428, 432, 487,
490
jargon, 170, 384, 396
Jeffrey, 163, 165, 167
Jenkins, 162
Jensen, 231, 233
Johnson, C. E., 94, 115
Johnson, E. K., 1, 12, 341
Johnson, J. S., 370, 415, 418, 4324
Johnson, K., 460, 470, 472, 499
Johnson M., 447, 457
Jones, D., 65, 77, 91
Jones, L. G., 171, 211
Jordanian Arabic, 380
Jun, 324, 341, 498, 499, 500
Jusczyk, 1, 11, 12, 40, 53, 238, 254, 320, 3401,
363, 370, 466, 472, 475, 483, 4989
Kager, 8, 12, 462, 499
Kaisse, 474, 499
Kaplan, 266, 290, 443, 445, 44850, 459
Karlsson, 363, 365, 369, 370
Katz, 901
Kavanagh, 113, 115, 214
Kay, 56, 432, 435, 502
Kay Elemetrics Corp., 249
Kaye, 8, 12
Keating, 182, 214
Kehoe, 363, 370, 418, 433
Keller, 17, 53
Kemler-Nelson, 3401
Kemmerer, 22, 56
Kennedy, 3401
Kenstowicz, 50, 53
Kent, 24, 53, 201, 211, 260, 266, 287, 447,
457, 480
Keren-Portnoy, ix, 2, 5, 10, 1213, 291, 313,
318, 341, 416, 430, 433, 452, 4567
Kessler, 55
Keyser, 8, 11
Keysers, 448, 456
Khanty, 456
Khattab, ix, 3, 10, 344, 376, 378, 4089
Kinney, 314
Kiparsky, 144, 160, 166, 262, 287, 461, 471,
496, 499
Kirchner, 454, 457
Kisseberth, 181, 211
Kiterman, 231, 234
Knewasser, 55
Knightly, 471, 498500
Koenigsknecht, 239, 256
Kõrgvee, 20, 26, 53
Korte, 201, 211
Kresheck, 226, 234
Krikhaar, 363, 372, 426, 435
Kubozono, 424, 433
Kuhl, 1, 12, 472, 475, 486, 4989
Kunnari, 10, 13, 20, 53, 291, 315, 318, 342,
36371
labial/s, 68, 3526
labial–alveolar, 279
labiality, 6578, 813, 148, 151, 270, 367
labiodentality, 6578
Labov, T., 23, 54, 480, 499
Labov, W., 23, 42, 50, 534, 11415, 148, 166,
480, 499
laboratory phonology, 460
Lacerda, 499
Ladd, 55
Ladefoged, 412, 54, 61, 7980, 91, 2334
Lahey, 443, 456
Lamel, 339
Lamprecht, 294, 313
Landberg, 261, 289
Large, 53
lateral release, 701
laxness, 83
Lebanese, 374, 3789, 381, 399, 405, 407
Lebanese Arabic, 379
Legendre, 466, 495, 498, 501
lemma, 471
length, 3746, 378, 38990, 394, 404, 406
Lenneberg, 61, 901
Leonard, 178, 211, 213, 238, 254, 284, 287,
289, 416, 433
Leopold, 201, 54, 612, 79, 80, 846,
8997, 103, 11011, 11415, 170, 193, 202,
211, 239, 254, 269, 448, 457, 463,
481, 499
Lepage, 448, 457
Leslie, 263, 287
Levelt, C. C., 8, 11, 340, 343, 360, 417, 433,
476, 498
Levelt, W. J., 417
Lewis, C., 284, 288
Lewis, L. B., 97, 363, 370, 415, 4323
Lewis, M. M., 612, 80, 84, 92, 260, 284, 287,
448, 457
lexeme, 471
lexical accent, 322
lexical contrast, 109, 111
lexical diffusion, 312, 416, 418, 4312, 482
lexical entries, 476
lexical frequency, viii, 48, 41518, 421,
42632, 468, 487; see also frequency
lexical identity effects, 467, 468
lexical item/s, 294, 390, 416, 427, 487
lexical parameter, 1034, 108, 114, 224
Lexical Phonology, 8, 474
lexical smoothing, 190–3
lexical stress, 322–3
lexical unit, 56, 21, 318, 377
lexical variation, 8, 185, 415, 418, 482, 490
lexicon, 7, 20, 24, 32, 38, 40, 46, 47, 80, 88, 93,
104, 111, 1737, 198, 199, 23840, 279, 284,
285, 295, 31718, 330, 3645, 369, 3745,
377, 407, 416, 429, 4312, 466, 470, 476,
480, 482
Li, 432, 433
liaison, 323, 334, 339
Lieven, 54
Lightbown, 165
Lindblom, 276, 287, 499
Linell, 206, 212
Linked-Attractor model, 447, 4601, 465, 472,
474, 479, 4834, 48896
liquid/s, 86, 356
liquid feature, 6578
List, 483, 501
Ljamina, 218, 234
Lleó, 266–8, 287, 313, 418, 433
Locke, 24, 54, 252, 254, 262, 265, 287
long consonants, 376
long-term memory, 201, 202; see also memory
Lowenstamm, 8, 12
low frequency, 416; see also frequency
Lucas, 262, 289
Lucchesi, 11
Luce, 22, 48, 56
Luckau, 165, 241, 254
Łukaszewicz, 344, 347, 350, 360
Lundberg, 261, 289
Luppino, 458
Lyons, 92
Lyytinen, 364, 370
MacKain, 238, 254
Macken, vii, ix, xi, 1, 2, 3, 7, 10, 12, 13, 223,
256, 51, 54, 56, 1334, 147, 1626, 170,
1748, 182, 1846, 196, 200, 211, 212,
23941, 2546, 262, 2668, 28791, 311,
314, 317, 325, 337, 341, 342, 360, 362, 371,
375, 404, 406, 409, 415, 418, 431, 434, 441,
450, 457, 463, 467, 481, 499, 503
MacNeilage, 24, 32, 52, 260, 270, 286, 288,
310, 314
MacWhinney, 170, 175, 206, 21213, 326, 341,
418, 434, 471, 501
Maddieson, 42
Maekawa, 6, 13
Maillochon, 339
Majorano, 2, 12, 313, 341
malapropisms, 203
Malikouti-Drachman, 207, 210
Malkiel, 103, 115
Mandarin, 168, 372
Manera, 448, 456
manner, 29, 138, 158, 271, 273, 304, 318, 346,
352, 356
manner of articulation, 153
mapping/s, 4, 7, 8, 49, 193, 199, 208,
426, 4624, 4724, 477,
482, 4845, 48893, 496
marked, 232, 310
marked forms, 494
markedness, 8, 10, 36, 344, 463, 466, 473
markedness constraints, 494
Markey, 284, 288
Marshall, J. C., 92
Marshall, P. J., 448, 457
Marslen-Wilson, 377, 408
Marti, 320, 340
Massey, 182, 211
Matelli, 458
Matthei, 4, 7, 12, 20, 54, 2657, 284, 288, 415,
432, 434, 455, 457, 471, 474, 477, 482, 500
Mattingly, 113, 115
Mattoso-Camara, 2934, 314
maturation, 252, 263, 334, 405
Matyear, 314
Matzenauer, 313
Maxwell, 239, 255
Maye, 475, 499
McCarthy, 1, 8, 12, 50, 54, 267, 288, 311, 314,
3778, 380, 403, 409
McCawley, 206
McCune, ix, 2, 3, 9, 14, 23, 24, 54, 56, 240, 255,
2605, 277, 284, 288, 2901, 31415, 342,
382, 384, 410, 442, 445, 448, 4509, 469,
480, 5023
McCune-Nicolich, 288, 4445, 457
McCurry, 94, 115
McDonough, 267, 288
McMurray, 460
McNeil, 218, 234
McNeill, 90, 92
meaning, 442–3, 449–51
means–ends, 204–5
Medeiros, 294, 313
MEFs, see monosyllabic experimental forms
Mehler, 341
Melli, 456
melodic patterning, 318
melody, 22, 277, 280, 281, 327
Meltzoff, 285, 288, 448, 457
memories, 3, 4, 6, 23, 47, 62, 2012, 2056,
209, 2635, 312, 335, 358, 360, 455, 460,
469, 476
Menn, 1, 4, 79, 12, 1920, 234, 26,
54, 110, 115, 140, 144, 147, 152, 160, 162,
164, 166, 1702, 178, 180, 188, 189,
1912, 199, 203, 206, 210, 212, 214, 2389,
253, 255, 259, 262, 2657, 284, 2878,
291, 312, 314, 317, 337, 341, 362, 367, 371,
375, 385, 409, 415, 432, 434, 4448, 458,
4604, 46785, 488, 491, 493, 496,
499500, 503
mental representation, 259, 2636, 416, 441,
44455, 468
Menyuk, 176, 204, 212, 224, 2334, 238, 255
Mesalam, 238, 254
Messum, 473, 500
metalinguistic awareness, 178
metathesis, 22, 2735, 80, 84, 86, 144, 147,
148, 153, 159, 1645, 179, 1845, 219, 221,
230, 231, 267, 268, 348, 462
metrical structure, 17, 3204
Metsala, 4323
Mezzomo, 313
Mike, 232, 234
Milewski, 3467, 360
Miller, G. A., 92
Miller, J., 4, 13, 56, 240, 256, 262, 290, 318,
342
Miller, R., 4, 13, 24, 56, 213, 2401,
256, 262, 270, 290, 295, 318, 342, 369,
372, 459
minimal pairs, 99, 105, 176, 320
minimal word, 380
Miranda, 294, 314
mirror neuron/s, 447–8, 451
misperceptions, 1357, 163
Miyata, 418, 421, 434
Mohanan, 8, 12
monosyllabic experimental forms (MEFs),
21819, 2234, 228333, 237
Montez Giraldo, 165
Moore, 285, 288
mora, 268, 378
morphophonemic rule, 205–6
morphophonological rules, 205
morphophonologies, 206
Morrisette, 416, 433, 434
Morsi, 380, 408
Moskowitz, 104, 106, 115, 1434, 153, 166,
170, 176, 193, 201, 212, 2334, 415, 432,
434, 467, 481, 500
motor, 263
motor control, 23, 209, 277, 480, 490
motor procedures, 442
motor sequences, 169
motor skill, 266
motoric maturation, 252
motorically accessible, 310
Mottet, 339
Mozer, 284, 288
Mulford, 262, 289
multidimensional hyperspace, 486
multi-modal, 489
Munson, B., 67, 1112, 52, 255, 417, 433,
434, 4601, 465, 46970, 472, 485,
498, 500
Munson, C., 460, 500
Murphy, 11
Myers, 267, 288
Myerson, 206, 212
Naeser, 212
Nakai, 6, 13, 24, 56, 320, 342, 344, 361, 364,
370, 452, 456
Nakazima, 176, 212
nasal, 353, 354, 356
nasalization, 355
nasality, 6578, 823, 86, 114, 151, 180, 195,
300, 356
Nasr, 378, 409
natural process, 182, 187, 193
Nazzi, 5, 1112, 320, 341
neighborhood density, 4312
neighborhoods, 22, 432
Nelson, 341, 444, 458, 501
Nespor, 321, 341
Nettelbladt, 365, 371
network, 9, 469
network models, 478
neural activation, 484
neurological bases of development, 447
neurological model, 450
neurological research, 447
neuromotor, 2, 5, 260, 451
neurophysiological, 446
Newhoff, 238, 254
Newport, 418, 426, 433, 475, 501
Nguyen, 339
Nicholas, ix, 7, 12, 2634, 284, 286, 288, 458,
461, 481, 484, 496, 500
Ninio, 430, 434
no onset, 36, 3840, 51
nonlinear, 8, 380, 443
nonlinear model, 49, 268
nonlinear phonology, 49, 266
nonlinear representations, 49
non-natural rules, 169
nonsegmental, 613
Nyström, 448
object permanence, 451
Obrecht, 378, 409
offline, 465
offline representation, 464
Oh, 471, 498500
Ohala, D., 267, 289
Ohala, J., 41, 54
Ohnesorg, 612, 92, 225, 234
Okada, 432, 434
Oldfield, 92
Oliveira, C. C., 313
Oliveira, M. A., 293, 314
Oliveira-Guimarães, ix, 3, 291, 295, 467, 480
Oller, 36, 55, 238, 255, 310, 314
Olmsted, 96, 110, 116
Ołtuszewski, 232, 234
omission/s, 289, 36, 38, 40, 98, 179, 268, 283,
3334, 345, 354, 3623, 366, 368, 375, 418,
419, 4201, 426, 429, 4312,
438, 487
omitted, 348
online, 4635
onomatopoeic words, 97
opening, 6578
openness, 6672, 88
opposition, 106, 232
Optimality Theory (OT), 8, 4623, 466,
473, 494
ordered rules, 230, 463
ordinary replacement forms (ORFs), 21819,
22330, 237
ordinary forms, 226
ORFs, see ordinary replacement forms
OT, see Optimality Theory
Ota, ix, 7, 8, 9, 409, 41719, 424, 426, 431, 434,
466, 468, 473, 483, 4878, 494, 500
Otomo, 262, 289
output attractor/s, 491, 495
output constraints, 169, 193, 195
output forms, 227
output frequency, 4889
output lexical entries, 476
output lexicon, 1979, 203, 208,
463, 471
output pattern/s, 133, 157, 2445, 248,
252, 472
output representation/s, 471, 488, 489, 492
output template, 475
output variation, 490
output vocabulary, 197
overgeneralization, 169, 186, 189, 1913, 208
Pačesová, 219, 234
Paden, 417, 433
Pakinam, 410
palatal, 3534
palatal pattern, 2778, 280, 282, 286
palatalization, 3523, 355
palatals, 352
Palestinian, 380
parameters, 6, 8, 43, 50, 93, 112, 195200, 232,
241, 268, 334, 421, 443
Park, 499
paronymic attraction, 219
Pater, 8, 12, 462, 499
pattern attractors, 491
pattern force, 148, 151, 153, 156, 157
pausing, 365
Pearson, 24, 54
Peizer, 21, 52, 115, 165, 172, 211, 224, 233,
291, 313
Peperkamp, 460, 500
Perani, 458
perception, 1, 4, 6, 19, 21, 42, 48, 51,
7689, 174, 176, 198, 201, 204, 209, 259,
260, 264, 268, 319, 320, 325, 363, 460, 462,
483, 495
perceptual magnet, 472, 486
perceptual processing, 23, 263
perceptuomotor link, 262
performance, 18, 19, 89, 113, 172, 210
Peters, 165, 171, 208, 213, 404, 409, 483
Peterson, 241, 254
Pethick, 364, 372
Phillips, 416, 434
phone class/es, 93, 989, 103, 105, 108,
111, 114
phone tree, 93, 97, 99, 103, 105, 109
phoneme, 21
phonemicization, 161
phonetic control, 1556, 158, 176, 208
phonetic length, 38990
phonetic parameter, 50, 104
phonetic score, 442
phonetic specification, 6
phonic core, 47, 112
phonological awareness, 11213
phonological challenge, 3, 408
phonological conditioning, 418
phonological contrasts, 103, 105
phonological grammar, 48, 292, 308
phonological idiom/s, 93, 106, 113, 143, 169,
1936, 198, 208, 217, 312, 463, 467, 481,
488
phonological knowledge, 3, 7, 17, 40, 47, 172,
311, 317, 374, 474
phonological length, 374, 38990, 403, 405
phonological memory, 5
phonological organization, 17, 25, 41, 48, 50,
112, 113, 238, 282, 291, 345, 486
phonological priming, 478
phonological rule, 217
phonological structure, 421, 426, 427
phonotactic constraints, 494
phonotactic regularities, 47
Piaget, 87, 92, 162, 166, 264, 283, 289, 443,
448, 450, 451, 458
Pierrehumbert, 6, 7, 12, 41, 42, 449, 55, 460,
465, 470, 472, 475, 490, 498, 500
Pine, 54
Pisoni, 53
pitch accent, 419, 422, 4256, 437, 438
pitch patterns, 424
Piwoz, 341
place, 29, 39, 40, 71, 98, 138, 1423, 153, 157,
193, 195, 196, 202, 2704, 294, 304, 308,
313, 318, 329, 334, 352; see also place of
articulation
place harmony, 491
place of articulation, 98, 138, 153, 195, 196,
271, 308, 313, 346, 349, 357, 358, 466,
496; see also place
planar segregation, 267, 271, 274, 275, 281,
321, 337, 338
planning, 3, 23, 195, 310, 345, 357, 359, 360,
454, 493
Platt, 175, 213
play, 263, 284, 4445, 448, 455
Plénat, 332, 341
plosive, 71
Poeppel, 447, 450, 457
Polish, 3449, 353,
467, 487
Pons, 5, 12
Poole, 50, 52
Portuguese, 301, 355357, 473; see also
Brazilian Portuguese
position in the word, 3556, 357, 360, 362,
3689, 380
positional variants, 42
practice, 5, 23, 24, 29, 40, 171, 202, 205, 374,
376, 396, 405, 454
preception bias, 176
preferred neural pathways, 284
preferred pattern, 244, 251, 252
pre-phonemic, 176
pre-utterance vowel, 1356
Prévost, 322, 324, 340
Priestly, ix, 2, 3, 9, 13, 1845, 189, 200, 213,
217, 227, 2345, 291, 3001, 314, 343, 360,
375, 406, 409, 441, 458, 462, 4667, 481,
491, 501, 503
Prince, 1, 8, 12, 13, 50, 54, 311, 314, 3778,
380, 403, 409
Principles and Parameters framework, 322
problem solving, 144, 160, 164, 188, 202,
209, 210
process, 19, 40, 43, 112, 144, 147, 148, 161,
1823, 1867, 193, 2246, 248, 304, 333,
348, 355, 3589, 362, 368, 4067
processes, 8, 22, 26, 43, 4850, 84, 112, 140,
144, 147, 1512, 157, 158, 161, 187, 193,
268, 304, 307, 3435, 3529, 362, 365, 380,
415, 4645, 491
processing, 168, 284
production, 16, 10, 1929, 402, 48, 51, 111,
1356, 138, 147, 1512, 156, 17381, 185,
192, 195204, 208, 209, 23853, 25960,
26884, 294304, 308, 31113, 31822,
325, 328, 331, 3349, 344, 348, 349, 35460,
364, 371, 3745, 395, 423, 431, 437, 455,
4612, 472, 474, 4767, 487, 490
production patterns, 40, 239, 240, 244, 245,
248, 249, 251, 253, 260, 262, 266, 280, 283,
399, 441
production routines, 282
program, 169, 170, 195200
programming, 169, 170, 195, 197, 200
progressive idiom/s, 106, 107, 269
progressive phonological idioms, 193, 208, 481
prominence, 6578, 89, 324, 368, 3746, 390,
395, 397
pronunciation, 169, 178, 204
prosodic, 2, 5, 6, 22, 41, 46, 49, 50, 61, 114,
133, 137, 148, 151, 157, 163, 209, 239, 260,
271, 31922, 3246, 330, 332, 3357, 375,
422, 427, 428,
431, 477
prosodic heightening, 260
prosodic phonology, 321
prosodic structure, 2, 6, 46, 319, 324, 326, 328,
333, 337, 378, 466
prosodic template, 330, 332
prosodic unit/s, 151, 404
prosodic word, 271
prosodically highlighted, 262
prosody, 5, 66, 320, 337, 375, 466
proto-determiner/s, 330, 3323, 335
proto-words, 1702, 176, 448
psychological reality, 161, 170, 265, 284
Pulvermüller, 451, 458
Pye, 418, 434, 483, 501
quantitative, 407
quantity, 3689
quantity-sensitive, 378
Radical Construction Grammar, 41, 49
Ramsay, 263, 289, 448, 458
Ramus, 320, 341
Rangel, 294, 314
Ravid, 377, 409
real time, 4634
real-time mapping, 464
recidivism, 191, 193
recognition, 21, 47, 87, 93, 153, 1734, 179,
205, 232, 238, 242, 264, 320, 4712,
475, 480
Redanz, 341, 363, 370
reduction, 96, 107, 1389, 144, 1468, 1512,
155, 158, 163, 199, 332, 3479, 351, 353,
365, 367, 381, 398, 412, 431
redundancy, 465, 474, 495
reduplicated, 29, 61, 689, 71, 758, 82, 878,
106, 111, 2967, 305, 307, 310, 3289, 332,
334, 367
reduplicating, 331
reduplication, 278, 334, 6578, 834, 107,
110, 194, 2312, 266, 294, 3035, 308,
31011, 3302, 3367, 3645
reduplicative, 75, 301, 310, 312
referential language, 265, 4501, 4534
referential language use, 265, 454
referential words, 264, 285, 4489, 453
regression, 8, 27, 239, 244, 269, 281, 291, 374,
405
regressive assimilation, 304
regressive idiom/s, 106, 153, 225
Reichling, 19
relational words, 284
reorganization, 133, 1512, 239, 241, 248, 252,
279, 308, 310, 328, 470
representation, 3, 69, 1719, 23, 423, 4650,
112, 144, 161, 175, 205, 213, 225, 229, 259,
2638, 2713, 2756, 279, 281, 2845,
2912, 307, 30910, 31213, 317, 320,
3367, 416, 441, 4448, 4501, 455, 4602,
464, 4689, 4712, 4756, 4789, 480,
4849, 4926
representational capacity, 270
representational play, 2635, 445,
448, 451
restructuring, 244, 248, 2512
resyllabification, 319
reversions, 219, 228
Reznick, 364, 370
rhythm, 45, 31920, 334, 338, 454
rhythmic patterning, 23
rhythmic shape, 80, 408
Ribas, 313
Rice, 501
Richardson, 363, 371
Riggio, 456
Ritterman, 416, 433
Rizzolatti, 447, 456, 458
Rosch, 456, 4589
Rose, 322, 326, 341, 382, 385, 410
Roug, 261, 289, 457
Roug-Hellichius, 453, 457
rounding, 6578, 86, 187, 496
routine/s, 2, 63, 1356, 163, 1712, 199, 239,
242, 253, 284, 405, 4767
Rovee-Collier, 2623, 265, 289
Rowland, 54
Roy, 468, 480, 501
rule generalization, 1901
rules, 4, 8, 22, 26, 28, 42, 48, 80, 83, 93, 103,
1058, 11213, 133, 13840, 147, 1512,
1579, 1613, 16970, 173, 175, 179209,
217, 230, 238, 265, 268, 271, 274, 307, 343,
345, 35860, 4617, 4714, 477, 4804,
48890, 4934
ruse/s, 22933
Russian, 104, 182, 206, 464, 490
Rutherford, 226, 234
Saaristo-Helin, 365, 371
Saccuman, 458
Saffran, 1, 13, 475, 501
Sagart, 5, 11, 51, 261, 286
Sagey, 274, 289
Saleh, 380, 410
Salem, 380, 410
Salerni, 52
Salidis, 418, 434
salience, 6, 10, 186, 248, 251, 262, 280, 282,
285, 3445, 3634, 368, 377, 387, 390
salient, 36, 283, 357, 363, 368, 376, 406
Salo, 20, 323, 55
Salvi, 499
Sander, 187, 213
Saporta, 92
Sartre, 263, 289, 450, 458
Savinainen-Makkonen, viii, ix, 10, 13, 36, 55,
314, 341, 344, 360, 36271, 376, 405, 410,
418, 434, 4667, 503
Schade, 23, 51
Scheer, 7, 8, 13, 338, 341
schema/s, 7, 21, 26, 87, 90, 111, 157, 277, 307
schemata, 359
Schiller, 417, 433
Schmidt, ix, 7, 12, 458, 461, 481,
484, 496, 500
Schwartz, J.-L., 500
Schwartz, R. G., 24, 55, 178, 211, 213, 284,
287, 289, 416, 434
Scifo, 458
Searle, 446, 458
Sedang, 44, 45
segmental, 7
segmental organization, 7
segmentation, 5, 173, 239, 253, 267, 320
segments, 56
select, 289, 32, 36, 405, 488
selected, 23, 23, 28, 32, 367, 94, 133, 144,
147, 149, 151, 153, 182, 185, 241, 280, 284,
2967, 3034, 327, 339, 351, 376, 391,
3956, 3989, 406, 425, 442, 496
selecting, 25
selection, 24, 29, 38, 79, 88, 164, 178, 182, 184,
194, 198, 209, 269, 282, 285, 329, 374, 398,
406, 415
selection rule/s, 199200, 464, 496
selectivity, 103, 108, 262, 368, 442
self-monitoring, 2023, 260, 262, 284
self-organization, 2
Selkirk, 321, 338, 341
semantic bootstrapping, 317
sensorimotor, 2, 5, 4512
sequencing, 209
Shahin, 380, 405, 410
Shaw, 474, 499
Shoeib, 380, 410
short-term memory, 2012; see also memory
sibilance, 6578, 81, 85, 86
Silber, 238, 255, 465
Simmons, 13, 56, 240, 256, 262, 290, 342
simplification, 48, 96, 13940, 152, 158, 1612,
196, 463
Sinclair, 319, 330, 3345, 342
single lexicon, 173
single-valued features, 43, 48
single-word period, 3
Siqueland, 11
skeletal tier, 275
slips of the ear, 137, 472
slips of the tongue, 1513
Slobin, 113, 168, 213
Smith, B. L., 182, 213
Smith, F., 92
Smith, K. D., 445, 55
Smith, L. B., 291, 314, 451, 458
Smith, N. V., 78, 10, 13, 49, 51, 53, 114, 116,
134, 13840, 162, 166, 170, 173, 175, 182,
188, 1912, 197, 213, 238, 255, 343, 361,
461, 463, 482, 501
Smolensky, 8, 13, 466, 495, 498, 501
Snow, C. E., 171, 213
Snow, K., 226, 234
social parameter, 104
sonority, 48, 6578, 347
Sosa, 431, 434
sound effects, 270
sound-play, 139, 1702, 204
Spanish, 3, 20, 22, 25, 32, 13344, 153, 1558,
1614, 168, 182, 185, 207, 404, 467, 471,
473
Spencer, A., 268, 289
Spencer, P., 452, 458
spreading, 321, 331
spurt, 269, 384
Stager, 23, 55, 57, 320, 341
stages of development, 8, 95, 106, 152, 245,
368, 408
Stampe, 8, 13, 187, 193, 213, 343, 3612, 371
statistical learning, 1
Steiner, 166
Stemberger, 20, 55, 266, 268, 284, 289, 362,
370, 463, 483, 498
Sterne, 171, 213
Stevens, 213, 499
Stich, 447, 458
Stockman, 239, 255
Stockwell, 137, 166
Stoel, 24, 55, 163, 1656, 238, 241, 2545, 262,
268, 314, 372, 431, 434, 483, 501
Stoel-Gammon, 24, 55, 238, 255, 262, 268,
2878, 291, 3623, 367, 370, 372, 376, 385,
410, 431, 434, 465, 476, 483, 501
Stone, 218, 234
stop/s, 25, 28, 40, 44, 51, 6578, 818, 99,
11011, 138, 143, 148, 153, 156, 1589, 163,
176, 179, 1824, 207, 235, 243, 245, 248,
2523, 274, 281, 285, 292, 308, 327, 3526,
390, 404, 47781
storage, 168, 2656
Storkel, 6, 13, 465, 501
strata, 474, 480, 482
strategies, 93, 110, 113, 13940, 165, 168,
17784, 194, 196, 2089, 217, 2249, 232,
321, 326, 355, 365, 441
stress, 4, 301, 6578, 89, 176, 200, 220, 231,
286, 293, 3034, 30910, 3205, 328, 336,
363, 368, 376, 378, 3801, 41819
stress placement, 323
stressed syllable, 44, 77, 1478, 151, 297, 324,
375
strongly articulated, 72, 77, 79, 813, 87
Studdert-Kennedy, 23, 55, 238, 255, 343,
357, 361
sublexical patterns, 48
substitution/s, 26, 80, 84, 86, 138, 144, 1478,
1529, 164, 179, 2246, 232, 238, 248, 253,
307, 309, 339, 345, 349, 3545
Sullivan, 262, 289
Sundberg, 56, 432, 435, 502
suprasegmental feature, 403
Suxanova, 218, 234
Švačkin, 231, 234
Swedish, 20, 40, 43, 365, 417
Swingley, 6, 13, 416, 435
Swinney, 491, 501
syllabic language, 334
syllable positions, 44, 46, 48, 344, 473
syllable structure, 48, 15860, 334, 37781,
3845, 404
syllable weight, 378, 419, 422, 426, 431
syllable-timed language, 323
symbol/s, 446, 449
symbolic, 17, 46, 50, 172, 259, 262, 264, 445,
449, 455
symbolic play, 455; see also play,
representational play
systematic, 3, 20, 26, 28, 103, 133, 148, 173,
179, 184, 189, 200, 21718, 238, 240, 245,
248, 249, 253, 265, 271, 280, 31819, 324,
3267, 329, 335, 345, 3489, 394, 403,
4056, 415, 4269, 428, 473, 487, 496
systematicity, 26, 158, 219, 238, 259, 271, 284,
291, 297, 326, 345, 351, 4067
systematization, 2389, 2512, 266,
279, 297
Szreder, ix, 10, 405, 4668, 487
target language, 317, 318, 334, 338
task, 203, 204
Teixeira, 294, 314
Temne, 42
template/s, 19, 1734, 3850, 1845, 194,
200, 267, 269, 2917, 30013, 31721,
32539, 343, 350, 35660, 362, 364, 3689,
3747, 395, 403, 406, 431, 4412, 446, 450,
4525, 463, 467, 4727, 48195
template matching, 367
templatic, 3, 394
Templin, 96, 116
Terrell, 416, 434
Tettamanti, 451, 458
Thal, 168, 364, 372
Theakston, 54
Thelen, 284, 289, 291, 314, 445, 4512,
4579
Theoret, 448, 457
theory, 151
Thiessen, 1, 13, 475, 501
Thompson, 456, 459
Tincoff, 55
Tishman, 239, 255
Toda, 42
Todd, 51
Todorovna, 499
Toivainen, J., 366, 372
Toivainen, K., 368, 372
token frequency, 294, 312, 417, 466,
488; see also frequency
Tokowicz, 471, 501
Tomasello, 41, 55, 341
Ton, 20, 52, 279, 286, 319, 340
trade-off, 97
Tranel, 338, 342
transduction, 1889, 198, 2003
Treiman, 48, 55
Tremblay, 322, 324, 335, 340
trial-and-error, 1869, 196, 208
trochaic, 378, 322, 338, 362, 37581
trochaic bias hypothesis, 322
trochaic foot, 322, 338; see also foot/feet
trochaic pattern, 376
truncation, 278, 324, 322, 325, 3305, 363,
365, 381, 418, 42032, 4378, 468
Tucker, 450, 452, 459
Tuomi, 363, 372
Turunen, 365, 372
two-lexicon model, 7, 173, 1978, 461, 471,
496
Tyler, A. A., 416, 435
Tyler, M., 1, 12
type frequency, 48, 312, 417, 466, 488; see also
frequency
typological, 3267
typological constraints, 31719, 326, 330,
3334, 337
typologically constrained, 320, 338
underlying form/s (UF/s), 103, 217, 22733,
462, 479
underlying representation/s, 140, 337, 469,
479, 496
underspecification, 48, 267
underspecified, 6, 320
underspecified phonemes, 174
unit phrases, 135, 137, 163
unitary feet, 324; see also foot/feet
universal, 21, 36, 40, 62, 93, 109, 112, 1389,
147, 1603, 204, 311, 321, 338, 380, 407,
466, 487
Universal Grammar, 8
universality, 4, 324, 478
universals, 4, 40, 133, 160, 162, 186, 285,
460, 466
unmarked, 311
unruly contextual effects, 467
unspecified feature, 268
usage-based, 1, 17, 29, 41, 47, 4601, 464, 466,
469, 471, 478, 4856, 4901
U-shaped curve, 8, 374, 405, 463, 481
Uzgiris, 263, 289
vague, 7
vague representation, 6
Vainikka, 499
Vainio, 368, 372
van der Hulst, 7, 13, 43, 44, 48, 49
Van Gulick, 265, 289
Vance, 432
Varela, 456, 459
variability, 46, 223, 401, 103, 140, 146, 155,
158, 186, 200, 240, 245, 24850, 253, 281,
284, 308, 31112, 326, 335, 344, 3528, 415,
466, 471, 47880, 490, 496
variable, 57, 22, 1056, 140, 144, 148, 150,
152, 156, 1589, 171, 177, 195200, 208,
242, 245, 252, 276, 283, 2923, 326, 332,
352, 354, 357, 375, 390, 395, 404
variation, 8, 18, 21, 412, 489, 51, 97, 99,
1037, 112, 133, 135, 138, 140, 1502,
1558, 163, 172, 177, 18592, 199, 2012,
209, 218, 224, 239, 2445, 262, 2746, 291,
2934, 308, 311, 318, 329, 334, 3527, 380,
395, 406, 424, 426, 429, 431, 444, 452, 468,
471, 47983, 4901, 496
velar/s, 352, 353
Velleman, P. F., 249, 255
Velleman, S., ix, 23, 1314, 23, 25, 29, 36,
38, 41, 56, 2658, 2834, 28991, 2967,
31519, 3423, 351, 361, 364, 368, 372, 376,
405, 410, 415, 435, 442, 453, 459, 469, 475,
480, 5023
Velten, 612, 68, 92, 182, 190, 213
Veneziano, 319, 330, 3345, 342
Vergnaud, 8, 12
Verlaine, 323
Verluyten, 3224, 342
Viegas, 293, 314
Vigorito, 11
Vihman, M. M., ix, 114, 17, 20, 225, 2930,
3641, 51, 545, 138, 167, 180, 184, 190,
210, 213, 23841, 2556, 2602, 26670,
277, 28390, 291, 2957, 31315, 31720,
326, 333, 335, 338, 3415, 351, 3612, 364,
36876, 382, 384, 398, 4056, 410, 41518,
4323, 435, 4412, 444, 445, 4513, 4567,
459, 461, 463, 46772, 4756, 4803, 489,
496, 5003
Vihman, V-A., 14, 291, 31819, 375, 405
Vitevich, 22, 47, 56
Vlahović, 232, 234
vocabulary, 48, 624, 88, 107, 11011, 134,
164, 197, 2012, 206, 241, 284, 2912, 295,
330, 345, 382, 384, 399, 405, 407, 41617,
444, 467, 470, 4801, 488, 490
vocabulary size, 48, 470
vocal capacities, 260
vocal control, 260, 452
vocal exploration, 284
vocal motor action, 266
vocal motor scheme/s, 260, 262, 2656, 270,
271, 273, 280, 2825, 442, 448, 4524
vocal motor skill, 266
vocal patterns, 260, 262
Vogel, 207, 213, 321, 341
voice, 71, 83
voiced, 69
voiceless, 85
voicelessness, 6578, 86, 114, 148, 151, 162
voicing, 51, 6578, 86, 1056, 108, 111, 137,
146, 153, 1558, 176, 180, 1823, 187,
1901, 195, 197, 207, 252, 276, 3523, 355
Volterra, 240, 254, 262, 286, 444, 456
von Raffler-Engel, 170, 213
Vorperian, 480
vowel sequences, 29, 323
Wales, 92
Walley, 4323
Wang, 1045, 114, 116, 312, 315, 416,
435, 503
Ward, 454, 459
Waterson, ix, 1, 3, 7, 14, 19, 202, 256, 57, 61,
63, 65, 92, 111, 116, 157, 167, 174, 1946,
199, 202, 210, 214, 2334, 2389, 256, 259,
2656, 270, 290, 315, 317, 3423, 361, 376,
410, 415, 431, 435, 443, 450, 459, 4623,
4667, 476, 481, 5023
Watson, 3768, 4034, 410
Wauquier, ix, 45, 33940, 342, 375, 410, 466
Wauquier-Gravelines, 318, 321, 337, 339, 342,
385, 410
weakly articulated, 72, 77, 80, 82
Weeks, 21, 52, 57, 104, 111, 11516, 165, 172,
211, 224, 233, 291, 313
Weenink, 382, 408
Weir, 61, 92, 171, 202, 214
Weismer, 239
Weissenborn, 11
Wellmers, 195, 214
Welsh, 20, 36, 38, 3445, 357
Werker, 23, 55, 57, 320, 341, 475, 499
Werner, 266, 290, 443, 445, 44850, 459
Westbury, 182, 214
Wheeler, 268, 287
whisper, 95, 111, 119, 202
Whitaker, 12, 416, 433, 452, 457
whole-word, 3
Wijnen, 24, 52, 363, 372, 426, 435
Wilbur, 174, 206, 214
Wilcox, 178, 211, 284, 287
wild variation, 467, 468
Williams, K. A., 499
Williams, N. M., 12, 416, 433, 452, 457
Woods, 239, 255
Woodward, 341
word combinations, 455
word endings, 245
word identification, 241, 382
word meanings, 444
word pattern/s, 1, 7, 22, 24, 26, 35, 41, 133,
136, 1405, 14853, 15764, 248, 281,
375, 407
word production pattern/s, 280, 283, 285
word recipe/s, 238, 239, 242, 244, 267
word structure, 49, 6578, 152, 1612,
311, 329, 334, 366, 380, 384, 388, 42930,
487
word-final consonant, 297
working memory, 335; see also memory
Wright-Cassidy, 340
Yaeger, 166
Yamaguchi, ix, 4, 5, 375, 410
Yeni-Komshian, 168, 214
Young, 340, 448, 457
Zamuner, 417, 435, 465, 502
Zawaydeh, 48, 53
Zelnicker, 163, 165, 167
Zlatin, 239, 256
Zonneveld, 8, 12, 462, 499
Zwicky, 203, 214
Zydorowicz, 344, 349, 350, 361