
The preschool child with a specific language impairment poses particular problems
for speech-language pathologists and other professionals. On the one hand, it might
seem desirable to initiate therapy as early as possible to give the child the best
opportunity of overcoming the impairment before starting school. On the other
hand, the disorder might resolve naturally, and treatment could create more problems
than it solves by producing low expectations in teachers, anxiety in parents, and
self-consciousness in the child. The parents of a language-impaired child want to know
what the future holds, in particular, whether the child will be able to cope with regular
schooling. Should one offer reassurance or give a more guarded prognosis? This is an
area where even experienced clinicians find it hard to make decisions. Drillien and
Drummond (1983) attempted to predict outcome for language-impaired children on
the basis of language assessment at 3 years. Using unspecified criteria, they decided
that 22% of the sample probably had "minor" or "transitory" problems, and 17% were
considered very likely to have considerable problems in school, leaving 61% about
whom they felt "much less certain."
If one turns to the research literature for information about prognosis, a rather
confusing picture emerges. Until recently, most studies concentrated on "functional
articulation problems" or "speech disorders" (i.e., what would today be termed
"phonological disorders"). The language problems of such children often extend
beyond phonology, as is evident in some of these studies. The general conclusion from
this early literature is that the proportion of children in a population who are regarded
as disordered declines with age. Morley (1972) investigated speech development in
944 children born in Newcastle-upon-Tyne, England, in May and June 1947. When
seen at just under 4 years of age, 17% had "defective articulation" not attributable to
organic disease. On follow-up at 4¾ years, this figure had dropped to 14%, and by
6½ years to 3%.
*Current affiliation: University of Manchester, England.
**Current affiliation: Birkbeck College, University of London, England.
Renfrew and Geary (1973) found that around 50% of children who had significant
speech errors on school entry (around 5 years of age) were speaking normally 6
months later. As in Morley's study, there was evidence that the initial language
problems of these children extended beyond articulation. Such results indicate that
some young children have a transient delay in language development but then catch
up with their peers and show no long-term impairment.
How common is such "recovery"? Very varied estimates are given by different
authors. MacKeith and Rutter (1972) suggested that 1% of all children identified as
having "language difficulties" as preschoolers (including children with additional
handicaps) would enter school with a handicapping language difficulty, while an
additional 4%-5% would have educational problems. In contrast, Drillien and Drummond
(1983) reported that only 60% of children identified as language disordered at 3 years
were subsequently "problem free" in school. One factor affecting results will be the
criteria for assessing outcome: Objective assessment may reveal deficits in children
who on superficial appearance seem normal. Griffiths (1969) followed up pupils who
had attended a special residential school for children with specific language
impairments. Of 26 children who were thought to be doing well enough to return to
mainstream education, only 5 were later found to be making satisfactory progress.
Several longitudinal studies of language-impaired children have reported a high rate
of linguistic, educational, and social impairments persisting many years after the
language impairment was first diagnosed (Aram, Ekelman, & Nation, 1984; Aram &
Nation, 1980; Fundudis, Kolvin, & Garside, 1979; Stark et al., 1984).
It seems, then, that although a preschool language disorder may be a transient
developmental delay that will resolve in time, in many cases, as Aram et al. (1984)
remarked, "The language disorders recognized in the preschool years are only the
beginning of long-standing language, academic, and often behavioral problems" (p.
240). The important question, then, is whether we can distinguish young children who
are likely to have persisting problems from those whose disorders are transient.
© 1987, American Speech-Language-Hearing Association 0022-4677/87/5202-0156$01.00/0
By far the most ambitious attempt to date to correlate early variables with subsequent
language development is the study reported by Schery (1985). She compiled data from
the records of 718 children and identified five clusters of predictor variables (cognitive,
socioeconomic, physical, language history, and socio-emotional/personality) that
were then used to predict subsequent progress. She found that performance level at
initial test was an overwhelming determinant of performance level 2-3 years later.
However, this is hardly surprising given that the age range of the sample was 3-16
years, and scores were not age adjusted. Schery's main interest was in seeing how well
one could predict amount of progress made over a 2-year period, after partialing the
effects of initial score. She found that younger children tended to make more progress
than older children, and that performance IQ also accounted for a small but
significant amount of variance. However, 74% of variance in gain in language scores
was unexplained. Although Schery provides a tentative list of variables that can be
used to predict language gain, the relationships she describes are clearly far too weak
for reliable prediction in individual cases, and her result leads to pessimism about
whether accurate prediction will ever be possible.
Bishop and Rosenbloom (in press) suggested that prognosis might be related to the
pattern of language impairments. They suggested that there might be two distinct
subgroups in the preschool language-impaired population. The first would consist of
children who are developing slowly, but along normal lines. Such children, who have
an even retardation of language skills and resemble a younger normal child in language
development, would be likely to have a good prognosis. The other hypothesized
subgroup consists of those whose language development is very uneven (e.g., with a
severe and selective phonological disorder but normal comprehension). The
suggestion is that the abnormal language in these children does not correspond to a
normal developmental variation but reflects some specific factor that interferes with
language learning. This hypothesis, which we will term Hypothesis A, thus predicts that
an uneven pattern of language impairment will have a worse prognosis than a more
even delay. Results of this kind have been reported for other developmental
disorders. Rutter and Yule (1973) found that children whose reading was severely
retarded but who had normal nonverbal intelligence made poorer progress than
children whose poor reading ability was consistent with their low nonverbal ability.
The explanation seems to be that the child with a general disability is learning by the
normal route, albeit slowly, whereas in the child with a highly specific problem, some
particular deficit is interfering with normal learning processes so that the skill can only
be mastered if an alternative learning strategy can be found.
However, a retrospective study of "communicatively impaired" children by King, Jones,
and Lasky (1982) suggested that an isolated impairment in one area of functioning
has a better prognosis than more generalized difficulties. Current
communication status was investigated using telephone interviews with a parent.
Outcome was worst for those who initially had "no speech," intermediate for those
with "language and articulation" problems, and best for "articulation problems." In a
similar vein, Hall and Tomblin (1978) reported that isolated articulation impairments
had a better prognosis than more generalized language difficulties.
This sort of finding suggests two alternative hypotheses about how pattern of
language impairment might relate to prognosis. Hypothesis B maintains that specific
language impairment is heterogeneous in etiology, symptomatology, and prognosis.
Some disorders, such as isolated phonological problems, might have a good prognosis,
whereas others do not. Hypothesis B does not make any clear predictions about the
relationship between selectivity of language impairment and prognosis; but it does
predict that provided we can accurately discriminate distinct patterns, we should not
find children changing from one pattern to another (i.e., the pattern of impairment
associated with each underlying disorder should be relatively stable).
We may formulate a final hypothesis, which is consistent with the results of King et
al. (1982) and Hall and Tomblin (1978) but which makes different predictions from
Hypothesis B. Hypothesis C regards specific language impairment as a unitary
disorder, where different aspects of language function are differentially vulnerable so
that the pattern of impairment observed depends on the overall severity of the
disorder.
Figure 1 illustrates Hypothesis C by drawing an analogy with a set of mountains that
will be submerged or exposed depending on the water level. An exposed mountain
corresponds to an impairment in that area of functioning: The more of the mountain
that is exposed, the more severe the impairment, and the higher the mountain, the
more vulnerable the function. Figure 1 illustrates a situation where expressive
phonology, syntax, and morphology, which are mastered relatively early in normal
development, are particularly vulnerable; other interpretations of Hypothesis C are, of
course, possible. Overall severity of language impairment is analogous with water
level. As children recover (corresponding, in the model, to the
water level rising), the pattern of disorder should change into one higher in the series
(e.g., a child with a general expressive impairment affecting syntax and morphology,
phonology, and semantics should improve so that only phonology and syntax and
morphology were impaired). According to this model, the number of areas of language
function that are impaired is a direct reflection of severity of overall impairment, so
this hypothesis makes the opposite prediction of Hypothesis A: Pervasive impairment
of all language functions will have a worse prognosis than selective impairment. One
further prediction that arises from consideration of the model in Figure 1 is that the
severity of impairment within any one area of language functioning will be directly
proportional to the number of areas of impairment. For example, the severity of
phonological impairment in a child who has a selective phonological problem will be
less than in a child with impairments in all areas of functioning.

FIGURE 1. "Submerged mountains" analogy to specific language impairment. [Figure not
reproduced. Labels: EXPRESSIVE LANGUAGE (phonology, syntax + morphology, semantics);
RECEPTIVE LANGUAGE.]

BISHOP & EDMUNDSON: Transient and Persistent Language Impairment 157
158 Journal of Speech and Hearing Disorders 52 156-173 May 1987
Our study was designed to overcome some of the limitations of previous work and to
address the following questions:
1. What range of severity of language problems is found in a sample of children
referred for professional help because of concern about language development at 4
years, and how many of these children "recover" from their early difficulties?
2. Is outcome of preschool language impairment related to severity of impairment, as
reflected in language test scores, and, if so, is the relationship strong enough to enable
one to predict outcome reliably for individual children?
3. Is there any regularity in the patterns of language impairment that are found in
preschool children, and, if so, is outcome related to pattern of impairment? In
particular, does an uneven profile of language impairment have a better or poorer
prognosis than a more even pattern?
METHOD
Subjects
Rationale for selection of subjects. Because one aim of the research was to provide
data that would help speech-language pathologists and other professionals predict
outcome for preschool children referred to them, it was decided not to use rigid
selection criteria based on language test scores, but rather to take as a starting point a
group of children who were selected simply because they had been referred for
professional help because of concern that their language development was impaired
for no obvious reason. An alternative approach would have been to use an objective
criterion based on test scores, such as that proposed for "specific language
impairment" by Stark and Tallal (1981), but we felt that although this may be a useful
definition to employ in some research settings, it would not be appropriate here
because it would exclude many children of interest. Stark and Tallal found that their
criterion applied to less than half of a group of children with normal performance IQs
who had been referred as having language impairment. Their definition explicitly
excludes children with severe expressive difficulties and normal receptive language,
and others with disproportionately severe articulation problems. Such children are
frequently encountered in the clinic and are often precisely those who pose prognostic
problems. They are of particular interest to us because of their uneven pattern of
language impairment. In addition, many children who are referred to speech-language
pathologists fail to meet the Stark-Tallal criterion because their language difficulties
are not severe enough. Yet retrospective studies of children with educational
difficulties suggest that even relatively mild delays in early language development are
associated with later problems, and a recent prospective study demonstrated a
strong link between one aspect of phonological competence in the preschool years
and later reading skills in children who were not regarded as language disordered
(Bryant & Bradley, 1983). For these reasons, rather than begging the question of what
constitutes a significant language impairment by using predefined test criteria, we
investigated this question empirically by taking a group of children whose language
development was giving cause for concern and then setting out to study the range and
severity of language problems encountered and to relate these to outcome.
Although a control group was not crucial to the design of the study, which involved
contrasting language-impaired children with good and poor outcome, it was felt
desirable to obtain some control data, especially on those measures that were not
adequately standardized, so that we could gain some idea of the level of impairment in
the language-impaired children relative to a normal sample. Resources did not extend
to studying a control group longitudinally, but it was possible to test cross-sectional
samples for each of the ages at which language-impaired children were assessed.
Language-impaired sample. Children were recruited for the project via
speech-language pathologists and pediatricians in the northeast and northwest of England
who were asked to refer to us any child aged between 3 years 9 months and 4 years 2
months who was being seen because of concern about language development, who
was not regarded as intellectually retarded, and whose parents were willing for him or
her to participate. We asked referring agencies to exclude children with sensorineural
hearing loss, physical handicap, craniofacial abnormalities, infantile autism, and those
where English was not the first language spoken at home. (No such problems were
detected in our sample either by us or by the other professionals involved with these
children in the course of the study.) In Britain, children's hearing is routinely screened
in the first year of life by health visitors and again by medical officers at school entry,
and we relied on the results of those tests to ensure that none of our sample had
sensorineural hearing loss. Because recurrent secretory otitis media is extremely
common in preschool children and there
is considerable debate as to its impact on language development, we did not exclude
children with a history of otitis media but did analyze our results to see if we could find
any evidence that children with such a history differed from the remainder of the
sample. Results from this part of the study, which were largely negative, are reported
elsewhere (Bishop & Edmundson, 1986), and for the purposes of this report children
are not subdivided according to this variable.
Between April 1982 and April 1984, 88 children (72 boys and 16 girls) were enrolled in
the study. All but 1 of these children were assessed by one of the authors on three
occasions (see below). The parents of the remaining child had become unhappy about
an assessment carried out by local services to determine educational placement and
were unwilling for him to undergo the final language assessment.
It was not feasible to control the amount of speech/language therapy the child
received during the study, but details of therapy received by the child over the course
of the study were recorded. Each child was rated according to amount of therapy
received during the 18-month period of the study on a 4-point scale, where 0
corresponded to no therapy, 1 to less than 11 hr of therapy, 2 to more than 11 hr of
therapy, and 3 to at least 2 months of daily attendance at a language unit.
Control group. Control children came from four schools and nurseries in the northeast
of England. (Contact had previously been made with these institutions, because in
each case one or more of the language-impaired children had attended the school or
nursery.) We aimed to test a sample of at least 20 children on each test in each age
band, with a similar sex ratio to the language-impaired sample (80% boys). In each
nursery, all boys of suitable age who were available during the testing session were
seen, plus a small number of girls selected at random from the pool of available
children. The only exclusions were for children receiving speech therapy, and one or
two children who were timid of strangers. The nursery scheduling meant that children
were not available for sessions of more than half an hour, which meant that it was not
possible to give all tests to all children (see below for procedure). Data were collected
on thirty-seven 4-year-olds (mean age: 47.5 months; SD = 1.8), twenty-three
4½-year-olds (mean age: 53.0 months; SD = 1.68), and nineteen 5½-year-olds (mean
age: 65.2 months; SD = 1.44).
Procedure
Language assessment: Rationale for selecting a test battery. We thought it would be
particularly useful to see how far one could gain reliable predictive information from a
single assessment session such as would be feasible in a clinical setting. This ruled out
lengthy procedures that would involve several sessions with the child and forced us
to choose between using the available time for a detailed assessment of a limited
range of language functions or for investigating a broader range of skills in less
detail. Because of our
interest in relating outcome to the pattern as well as overall level of impairment, we
chose breadth rather than depth of assessment and selected a test battery that gave
measures of expressive competence in phonology, syntax and morphology, semantic
relationships and vocabulary, and receptive ability to handle grammar and
vocabulary, as well as a more general measure of verbal comprehension of
instructions.
Language assessment: Content. The language test battery is described in detail in the
Appendix. The two multiple-choice comprehension tests, the British Picture
Vocabulary Scale (BPVS) and the Test for Reception of Grammar (TROG), occurred at
the end of the session and were not given to all children in the first two assessments
because some subjects were tired and inattentive by this point. All responses to the
Action Picture Test and Bus Story Test were analyzed by a computerized package for
grammatical analysis, which performs a semiautomated LARSP analysis (Crystal,
Fletcher, & Garman, 1976) and computes mean length of utterance in morphemes
(MLU) (Bishop, 1984). Total number of utterances was defined as all utterances made
during these two tests, including social phrases and unintelligible utterances. The
number of utterances that were
totally or partially unintelligible was computed as a percentage of total number of
utterances. This is a strict criterion for unintelligibility, whereby an utterance scores
as unintelligible even if it is only one word that cannot be identified. Spontaneous
responses to the Naming Vocabulary and the Newcastle Speech Assessment were
transcribed live and then again from tape by both authors. Where there was
disagreement, one listener heard the recording again and decided on a final version.
Instances where the child repeated the word after the tester or where a word was
repeated in the transcript were deleted. If the child gave more than one phonetic form
for a word, then that closest to the adult form was included. Because children varied
in the number of pictures they could name spontaneously, the number of words analyzed
was not constant but averaged 44 words at 4 years, 55 words at 4½ years, and 62 words
at 5½ years. Percentage consonants correct was computed according to the criteria of
Shriberg and Kwiatkowski (1982).
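The strict intelligibility criterion described above can be illustrated with a short sketch. This is not the authors' analysis package; the marker "xxx" for a word the transcriber could not identify, and the function name, are assumptions made purely for illustration.

```python
# Hypothetical marker for a word that could not be identified in transcription.
UNINTELLIGIBLE = "xxx"


def percent_unintelligible(utterances):
    """Percentage of utterances containing at least one unidentifiable word.

    Each utterance is a list of transcribed words. Under the strict criterion,
    a single unidentified word makes the whole utterance count as unintelligible.
    """
    if not utterances:
        raise ValueError("no utterances to score")
    flagged = sum(1 for words in utterances if UNINTELLIGIBLE in words)
    return 100.0 * flagged / len(utterances)


# Invented sample: two of the four utterances contain an unidentified word.
sample = [
    ["the", "dog", "is", "running"],
    ["he", "xxx", "ball"],
    ["xxx", "xxx"],
    ["hello"],  # social phrases count toward the total
]
result = percent_unintelligible(sample)
```

Because partially and totally unintelligible utterances are counted alike, this measure will tend to be higher than one based on the proportion of unintelligible words.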
The language measures were grouped into four main categories when analyzing
patterns of language disability. Our original aim had been to distinguish between the
different components of receptive language that were tapped by the three
comprehension tests, but, as noted above, the test battery proved too long for many
4-year-olds so that there were frequently missing data; it was decided, therefore, to
treat "language comprehension" as a single category in the pattern analyses. The
measures corresponding to each category were as follows:
Expressive language: Phonology
  Percentage consonants correct
Expressive language: Syntax and morphology
  MLU
  Action Picture Grammar score
Expressive language: Semantics
  Bus Story Information score
  Action Picture Information score
  British Ability Scales (BAS): Naming Vocabulary
Language comprehension
  British Ability Scales (BAS): Verbal Comprehension
  BPVS
Assessment of nonverbal ability. A modified, shortened version of the Leiter
International Performance Scale (Leiter, 1948) was administered to all
language-impaired children at 4 years. Children were given the first 2-year-old
subtest, plus all the 3-year-old and 4-year-old subtests. Children who passed all
3-year-old subtests
were credited with passing the 2-year-old subtests, whereas children who failed any of
the 3-year-old subtests were given all the 2-year-old subtests also. Total number of
subtests passed (out of 12) was recorded. The Goodenough-Harris Draw-a-Man Test
(Harris, 1963) was given at the final assessment.
Scheduling of assessments. All language-impaired children were assessed when aged
between 3 years 9 months and 4 years 2 months (mean age: 48.1 months; SD = 1.62),
again 6 months later (mean age: 54.3 months; SD = 1.76), and finally for follow-up 18
months after initial assessment (mean age: 66.3 months; SD = 1.69). The assessment
lasted 45-75 min and took place in a quiet room, usually at a speech therapy or
pediatric clinic or at school, although a few children were seen at home if a suitable
room was available.
Testing of control children was fitted around nursery activities, and not all control
children were given the whole battery. However, the only factors determining whether
a particular child was given a test were practical scheduling considerations and not the
child's apparent ability or willingness to attempt a test. Twenty-five control
4-year-olds and all older control children were given the language assessment tests
except for TROG and BPVS (for which comprehensive recent norms were available).
Phonological development approached ceiling levels in control children at 4 years, so
phonology was not assessed in the older control groups. Performance on the BAS
Verbal Comprehension scale approaches ceiling by 4½ years, and norms are not
available for children aged 5 years and over, so this test was not given to the oldest
control group. The modified Leiter test was given to 8 control 4-year-olds who had also
done the language tests, plus a further 12 control children of this age. All control
5½-year-olds were given the Goodenough-Harris Draw-a-Man Test.
Defining outcome according to test scores. It seemed important to use a fairly
stringent criterion, whereby a child was only regarded as having a good outcome if he
or she was essentially indistinguishable from controls. This meant that procedures that
classified outcome according to average performance on a range of language tests at
follow-up would be inappropriate because they would be liable to classify as good
outcome a child who had a marked impairment in a single area of functioning, such as
phonology, with average or above average scores on other tests. It seemed preferable
to use a criterion that defined good outcome in a
way that excluded children who had a severe impairment in any aspect of language
function, irrespective of their average scores. However, defining good outcome simply
as having no score that was severely impaired seemed inadequate, as such a definition
could include many children whose language was moderately impaired in all areas
but who did not have a strikingly severe deficit on any test. We therefore imposed an
additional constraint that a child had to meet to be regarded as good outcome: Scores
should be at least "satisfactory" on the majority of measures.
To make this criterion operational, we had to define two cutoffs for each language
test: one defining an "impaired" level of performance and the other defining a
"satisfactory" level of performance. The 3rd centile seemed a suitable cutoff for
impairment. The 10th centile was used as a cutoff for satisfactory scores. To have
required the child to score above the 10th centile on all measures would have been
too stringent; one can show using the binomial theorem that if the probability of
scoring above a criterion on any one test is .9, then the proportion of children who are
expected to score above this level on nine out of nine measures is only .387. Thus, it
seemed more reasonable to allow for a child scoring below this criterion on one test;
the proportion of children who would be expected to score above criterion on at
least eight out of nine tests is .775.
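The two proportions quoted above follow from the binomial distribution with nine tests and a per-test pass probability of .9. The sketch below reproduces them; like the argument in the text, it assumes that performance on the nine tests is independent, and the function name is ours.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): at least k passes out of n tests."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Probability of scoring above the 10th centile on all nine measures.
p_all_nine = prob_at_least(9, 9, 0.9)
# Probability of scoring above the 10th centile on at least eight of nine.
p_eight_plus = prob_at_least(8, 9, 0.9)

print(f"{p_all_nine:.3f}")    # 0.387
print(f"{p_eight_plus:.3f}")  # 0.775
```

Since language measures are usually positively correlated, the independence assumption is an approximation, and the proportions it gives should be read as rough guides.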
Cutoff points defining impaired and satisfactory scores for each test are shown in Table
1. For Action Picture Test, Bus Story Test, Naming Vocabulary, BPVS, and TROG,
satisfactory was defined as a score at or above the 10th centile (corresponding to a z
score above -1.29), and impaired was defined as a score at or below the 3rd centile (z
score below -1.89), these being derived from control data for the first three tests and
from test norms for the latter two. Because control MLU began to reach a plateau
after 4 years, and the variance was somewhat higher for the older children, use of z
scores to derive cutoffs for each age group would have led to the anomaly of having
less stringent cutoffs for older children. Therefore, the 10th and 3rd centile cutoff
points derived from 4-year-old controls were used at all ages. Control scores on the
BAS Verbal Comprehension Scale and percentage consonants correct approach ceiling
levels after 4 years of age, leading to skewed distributions. The cutoff points for verbal
comprehension were derived from control data at 4 and 4½ years. Because control
5½-year-olds had not been given this test because it was deemed too easy for them,
criteria for rating impairment at final assessment were based on 10th and 3rd centile
points from test norms for the oldest age group for whom data are provided (4¾
years). For percentage consonants correct, cutoff points were defined on an ad hoc
basis, with a score corresponding to Shriberg and Kwiatkowski's (1982)
"moderate-severe impairment" category regarded as impaired at 4 years and a score
corresponding to "mild-moderate impairment" as impaired at 5½ years.
To sum up, our final criterion for good outcome was one that most children in the
normal population could be

BISHOP & EDMUNDSON: Transient and Persistent Language Impairment 161 TABLE1.
Criteria for rating performance.
Test
MLU
Naming Vocab VerbalComp 77 Action P Info 18 Action P Gram 16 BusStory 8 TROG
centile 10 BPVScentile 10 % cons correct 85
4.73
77 53 86 71 22 15 20 13 12 5 10 3 10 3 85 75
expected to meet and that excluded children with severe but isolated impairments in
one area of functioning. To be classified as good outcome, the child must (a) have no
score in the impaired range and (b) have no more than one score below the
satisfactory range.
RESULTS
Characteristics of Language-Impaired Sample at Initial and Final Assessment
Although we had aimed to exclude children with intel- lectual retardation, 19 of the
children in our sample scored more than two standard deviations below the control
mean on the modified Leiter test. Data from this subset of children, who will be
referred to as the general delay (GD) group, are treated separately from that of the
remainder of the sample, referred to as the specific language-impaired (SLI) group.
The criterion that we developed to evaluate outcome status can be applied to the data
obtained at initial assessment to check that the sample was indeed language impaired
at 4 years of age. At this age, only 2 of 87 (4%) children in this sample achieved the
criterion of having no score in the impaired range and no more than one score below
the satisfactory range. (Both these children had mild-moderate phonological
impairment by Shriberg and Kwiatkowski criteria.)
When the same criterion was used to divide children into those with good and poor
outcome at final assess- ment, it was found that 30 of 68 (44%) children with specific
language impairment and 2 of 19 (11%) of those with general delay have good
outcome at 5V2years.
Results for children with good and poor outcome at each age are shown separately for
children with general delay and specific language impairment and contrasted with
control data in Table 2. One-way analysis of variance was used to compare scores of
control children with those of all language-disordered groups except for the general
delay poor outcome group, which contained only 2 sub- jects. The 5% level of
confidence is adopted for these analyses. Where the overall F ratio was significant, a
specific comparisons, results of
4 42/e Satisfactory:
score >-
80 20 18
8 10 10 85
51/2
Age (in years)
4 41/2 51/2
Impaired: score <-
4.07 4.07 4.07
4.73 4.73 60 63
55 74 18 15
533
75
70 78 20 18
833
84
4 4I/2
% sample
impaired
79 46 24 26 i4 23 26 13 5 51 34 16 72 58 26 30 11 4 24 14 23
7 6 18 74 55 34
Considering first the performance of language-impaired children relative to the
control group, as might be expected, the SLI poor outcome group scored consistently
below control levels at initial testing and continued to show significant deficits relative
to controls on several measures at final testing. Scores of the SLI good outcome group
were not significantly different from those of control children for any language measure
by 5½ years, although this group did do significantly more poorly than control children
on MLU, Action Picture Grammar, and percentage consonants correct when aged 4
years. Thus, a substantial proportion of children who did have measurable language
impairment when first assessed were no longer impaired 18 months later.
One might wonder whether the good outcome group was in fact not substantially
different from the control group at initial testing. Although we found some significant
test differences, there were many measures where these were not found, and it could
be argued that when multiple one-way analyses are conducted it is likely by chance
that one or two comparisons will be significant. However, when we look not at mean
test scores, but at the number of tests where a child is impaired, the initial deficits of
the SLI good outcome group are more striking. On the assumption that the probability
of an impaired score on any one test is .03, the probability of being impaired on two or
more out of nine measures is .039 (binomial theorem). In the good outcome group, the
proportion of children doing this poorly at 4 years was .633, and in the poor outcome
group the proportion was .947.
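The chance-level argument above can be sketched as a binomial tail calculation. This is a minimal illustration that assumes the nine measures are independent and share the same per-test probability of .03; the function name `prob_at_least` is introduced here for illustration only. Depending on rounding and on the exact assumptions used, the value it prints may differ somewhat from the figure quoted in the text.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    below_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


# Assumed inputs taken from the text: 9 measures, p = .03 per measure.
chance = prob_at_least(2, 9, 0.03)

# Observed proportions reported for the two SLI outcome groups at 4 years.
observed_good = 0.633
observed_poor = 0.947

print(f"chance of 2+ impaired scores out of 9: {chance:.3f}")
print(f"good outcome group exceeds chance by a factor of {observed_good / chance:.1f}")
print(f"poor outcome group exceeds chance by a factor of {observed_poor / chance:.1f}")
```

Whatever the exact chance value, both observed proportions are far larger than it, which is the point the comparison is making.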
Relationship of Outcome to Severity of Disorder at 4 Years
Comparisons of the SLI good outcome and SLI poor outcome groups at 4 years (see
Table 2) reveal significant differences on all language measures except percentage
consonants correct and Test for Reception of Grammar (where sample sizes were
small). For each language measure where data were available on at least 80 chil-
Scheffd
these being shown in Table 2.
test was used for

162 Journal of Speech and Hearing Disorders 52 156-173 May 1987
TABLE 2. Scores obtained at 4, 4½, and 5½ years for children with specific language
impairment (SLI), general delay (GD), and control group.
Number of utterances
A: Control 25 33.4 B: SLI-good outcome 30 38.1 C: SLI-poor outcome 36 40.7 D: GD-
good outcome 2 54.5 E: GD-poor outcome 16 43.2
F(3, 103) = 2.6 specific comparisonsa n.s.
% unintelligible
A: control 25 6.4 B: SLI-good outcome 30 15.0 C: SLI-poor outcome 36 26.7 D: GD-good
outcome 2 22.5 E: GD-poor outcome 16 35.3
(6.00) (11.08) (14.87)
(-) (15.33)
(6.32)
(9.47) (18:78)
(-) (19.13)
23 34.1 29 44.1 38 43.0
2 32.5 16 43.9 F(3, 102) =
n.s.
23 2.3 29 12.0 38 19.0
2 6.0 16 23.6
(8.77) (20.34) (15.93)
(-) (15.06)
(2.99)
(7.91) (13.09)
(-)
(16.19)
19 33.2 30 34.8 38 39.8
2 46.0 17 53.4
F(3, 100) = 10.6 A<E;B<E;C<E
19 2.0 30 2.7 38 8.5
2 2.0
17 11.7
MLU
A: control
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
Naming Vocabulary
A: control
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
Verbal Comprehension A: control
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
Action Picture Information A: control
B: SLI-good outcome
C: SLI-poor outcome
D: GD-good outcome E: GD-poor outcome
dren, Pearson correlation coefficients were computed to assess the strength of
relationship for scores obtained between each pair of sessions. Results are shown in
Table 3.
On the majority of measures, at least 30% of the variance in scores on the final test
session can be accounted for in terms of scores obtained in the first session.
23 6.4 30 5.6 37 3.8
2 4.3 17 3.3
(1.77) (1.82) (IAD
(-) (1.36)
19 6.4 30 7.8 38 5.3
2 5.9 17 4.5
4 years 41/2years 51/2years
N M SD N M SD N M SD
(5.68) (12.04) (I0.13)
(-) (19.7)
(3.25) (3.84) (6.83)
(-)
(11.62)
(1.29) (1.89) (1.76)
(-) (1.84)
(10.25) (8.68) (9.92)
(-) (13.47)
(8.92) (1o.96)
(-) (8.77)
(2.77) (2.55) (4.24)
(-) (3.76)
Discriminant function analysis was used to see how well the 4-year-old variables could
predict status at 5½ years for individual children. This procedure, which is clearly
described by Armitage (1971), is designed to apply to situations where subjects can be
divided into two groups and where, for each individual, measurements are avail-
F(3, 103) = 17.3
A<C;A<E;B<E A<C;A<E A<E;B<C;B<E
25 6.2 30 4.4 37 2.7
2 2.7 17 2.4
F(3, 105) = 51.1 A>B;A>C;A>E
(1.13) (1.49) (1.15)
(-) (0.99)
F(3, 102) = 15.8
F(3, 100) = 10.6
2.1
F(3, 103) = 19.8
A>C;A>E;B>C A>C;A>E;B>C
B>C;B>E B>E -B>E
23 75.2 29 69.1 38 55.3
2 67.5 17 50.6
F(3, 103) = 24.9 A>C;A>E;B>C
(11.36) (10.41) (11.94) (-) (10.44)
23 81.9 30 75.6 37 63.6
2 59.5
17 57.4
F(3, 103) = 23.6
(14.00) (7.78) (9.46)
(-) (13.54)
19 90.0 30 86.0 38 75.4
2 76.5 17 70.0
22 90.5 30 85.3 38 76.2
2 77.0 16 68.6
(10.15) (10.54) (14.38) (-) (10.53)
23 94.0 30 93.3 38 82.6
2 100.5 17 77.1
(10.18) (10.21) (10.81)
(-) (8.19)
30 99.2 38 90.0 2 100.5 17 87.7
F(3, 100) = 17.2 A>C;A>E;B>C A>C;A>E;B>C
B>E B>E B>E
F(3, 104) = 15.3
A>C;A>E;B>C A>C;A>E;B>C B>C;B>E
F(3, I02) = 13.6
F(2, 82) = 10.1
25 22.6 30 19.0 37 14.0
2 12.5 17 9.0
(3.63) (5.24) (5.38)
(-) (5.29)
23 25.4 30 24.6 38 18.9
2 17.0 17 15.9
(3.45) (3.78) (5.11)
(-) (5.61)
19 25.7 30 26.8 38 23.5
2 27.5 17 20.6
B>E B>E
F(3, 104) = 23.1
B>E;C>E B>E C>E
F(3, 105) = 31.0
A>C;A>E;B>C A>C;A>E;B>C A>E;B>C;B>E
F(3, I00) = 17.2
F(3, 100) = 13.2

TABLE 2. (continued)
Action Picture Grammar A: control
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
Bus Story Information A: control
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
TROG centile
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
BPVS centile
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
% consonants correct A: control
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
25 23.1 30 14.0 37 6.1
2 6.0 17 5.0
(5.41) (6.85) (4.60)
(-) (5.30)
23 24.8 30 20.1 38 11.6
2 15.5 17 8.9
(4.96) (7.10) (5.95)
(-) (5.67)
19 25.0 30 26.4 38 19.0
2 25.0 17 14.9
(3.54) (4.20) (6.18)
(-) (6.02)
(6.87) (6.06) (6.10)
(-) (4.62)
(26.69) (21.13)
(-) (9.52)
(20.30) (18.64)
(-) (10.55)
(4.33) (16.30) (-) (16.74)
(8.22) (12.88) (12.17)
(-) (10.03)
12-subtest Leiter raw/Draw-a-Man
A: control
B: SLI-good outcome C: SLI-poor outcome D: GD-good outcome E: GD-poor outcome
(1.69) (1.3) (1.57)
(-) (1.23)
19 89.2 30 95.5 38 87.8
2 86.0 16 78.1
F(3, 99) = 8.2
25 16.7 30 13.7 35 6.5
2 6.8 16 4.3
(6.19) (5.87) (3.62)
(-) (3.70)
23 17.4 30 19.0 38 10.7
2 15.0 17 7.1
(7.88) (7.64) (5.22)
(-) (5.65)
19 21.1 29 25.0 38 16.5
2 23.5 16 12.6
F(2, 31) = 1.9
F(2, 77) = 15.1
F(2, 82) = 15.4
23 93.6 30 65.1 38 55.9
2 34.0 17 54.1
(3.55) (19.68) (20.71)
(-) (21.78)
--
BISHOP & EDMUNDSON: Transient and Persistent Language Impairment 163
4 years 41/2years 51/2years
N M SD N M SD N M SD
F(3, 105) = 56.9
A>B;A>C;A>E A>C;A>E;B>C A>C;A>E;B>C
F(3, 104) = 35.0
B>C;B>E B>E B>E
F(3, 102) = 32.4
A>C;A>E;B>C A>C;A>E;B>C A>E;B>C;B>E
17 23.9 13 14.3 2 10.0 4 5.0
28 36.0 31 17.9 2 9.5 14 9.4
F(3, 104) = 16.9 B>E B>E
F(3, 98) = 18.4
(23.37) (16.61) (-)
(4.08)
n.s. B>C;B>E B>C;B>E
(21.70) (13.48) (-) (12.79)
36.0 (19.05) 15.6 (15.43) 11.5 (-) 13.4 (17.37)
30 41.4 38 16.4
2 22.0 17 8.6
30 41.9 (23.16) 34 20.5 (21.59)
30 35 2 16
30 40.0 38 18.4
2 50.0 (-)
16 8.9 (13.81)
2 30.0 17 5.7
F(2, 80) = 15.3
B>C;B>E B>C;B>E B>C;B>E
F(2, 78) = 14.2
F(2, 82) = 23.6
30 76.1 (16.64) 38 64.1 (19.96)
2 45.5 (-)
17 59.5 (26.64)
30 92.8 38 79.5 2 89.0 17 74.7
F(2, 82) = 4.5
F(3, 104) = 23.7
A>B;A>C;A>E B>E B>C;B>E
scaled
20 9.9 30 10.0 38 8.9
2 5.5 17 4.4
F(3, I01) = 60.9
A>E;B>E;C>E A>E;B>E
Note. Data for language-impaired groups are longitudinal, whereas those for control
children are cross-sectional.
aF ratios are significant at .05 level except where marked n.s. (not significant). Specific
comparisons from Scheffé test (.05 level) on one-way ANOVA comparing all groups
with data except Group D.
able on a set of independent variables. The aim is to find the best linear weighted
combination of these variables to discriminate between the two groups, so that future
individuals can be allocated to one group or the other depending on whether their
summed weighted score on the set of variables falls above or below a cutoff point, z0.
In the first analysis, data from children with general delay were excluded. Missing data
were estimated using one of the procedures discussed by Lachenbruch (1975), that of
calculating means from all available sample values and substituting these for missing
values. Data from TROG were excluded from the analysis, because the majority of
children had not been given this test. When all other 4-year-old variables were entered
into the equation,
F(3, 100) = 23.6
F(2, 82) = 12.5

TABLE 3. Correlations between test scores obtained by language-impaired children at
three ages.
the good outcome category. They included the 2 children with impaired TROG scores
described above plus 3 additional children who scored as impaired only on percentage
consonants correct at 5½ years, with scores ranging from 77% to 83%.
Because several of the language tests were highly intercorrelated (see Table 5), the
question arises as to whether outcome could be predicted on the basis of a reduced
subset of tests. For the whole sample, it was found that discriminant function analysis
using the four tests--Bus Story, Naming Vocabulary, Action Picture Grammar, and
Leiter--predicted outcome correctly for 86% of children. Indeed, using the Bus Story
Test as sole predictor, 83% of children were correctly classified.
We carried out a more detailed analysis to gain a clearer picture of the nature of
children's failures on the Bus Story Test. The distribution of 4-year-old Bus Story scores
for good and poor outcome groups is shown in Table 6, and it can be seen that a score
of 6 or less is strongly predictive of poor outcome. Points on this test are awarded for
predefined "ideas" that involve expressing semantic relationships. Clearly if the child
says very little, only produces single words, or is largely unintelligible, then a low
score is inevitable. Such explanations could account for about half the low scores. Of
37 children with low scores at 4 years, 10 were highly unintelligible, 3 were reticent,
and 5 produced only single-word utterances. For the remainder we need to seek other
explanations. The story does not place a heavy load on the child's memory because the
sequence of descriptive pictures is re-presented as the child retells the story. Indeed, it
is possible to earn some points by merely describing what is occurring in the pictures.
We noted, however, that many children who did very poorly on this test simply
repeated one or two striking details from the story. Some children made comments
during story presentation that indicated that they failed to appreciate that the series
of pictures each depicted the same bus, and many poor scorers appeared to have little
idea of the story sequence and would mention salient events in the wrong
Test
MLU
Naming Vocabulary Verbal Comprehension Action P Information Action P Grammar
Bus Story Information TROG
BPVS
% consonants correct % unintelligible
4/41/2 years 41/2/51/2years
4/51/2years
.791 .750 .673 .682 .688 .566 .743 .732 .528
.662 .661
.800 .733 .638 .788 .797 .760
- .611 -
.636 -
.835 .651
.605 .450 .357
-
Note. Correlations given for all tests where N ≥ 80. All values in this table are
significant at 1% level.
outcome status was correctly predicted in 60 of 68 children (88%). Results are shown
in Table 4.
For each child an overall score is computed by multiplying the raw score on a variable
by its weighting and summing these. Children whose overall score falls below the
cutoff are predicted as having poor outcome, and those whose scores are above cutoff
are predicted as having good outcome. The generalized distance is an index of
efficiency of discrimination (see Armitage, 1971). In only two cases was good outcome
predicted for children who obtained a poor outcome. Both of these children scored
well above the 10th centile on eight tests at 5½ years but were included in the poor
outcome group because they scored below the 3rd centile on TROG.
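The scoring rule just described (weight each raw score, sum, compare with the cutoff) can be sketched as follows. The weightings and cutoff are those listed in Table 4 for the four-variable analysis of the whole sample; the two child profiles are hypothetical values invented for illustration, and treating a score exactly equal to the cutoff as "good" is an assumption, since the text does not say how ties are handled.

```python
# Weightings and cutoff as listed in Table 4
# ("Whole sample: best four variables").
WEIGHTS = {
    "leiter": 3.8459,
    "naming_vocabulary": 0.8707,
    "action_picture_grammar": 1.3692,
    "bus_story": 1.8204,
}
CUTOFF = 117.35  # z0


def discriminant_score(scores: dict) -> float:
    """Sum of (raw score x weighting) over the four predictor variables."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def predicted_outcome(scores: dict) -> str:
    """Below the cutoff -> 'poor'; otherwise 'good' (tie handling is assumed)."""
    return "poor" if discriminant_score(scores) < CUTOFF else "good"


# Hypothetical 4-year-old profiles, not taken from the study data.
child_a = {"leiter": 10, "naming_vocabulary": 70,
           "action_picture_grammar": 15, "bus_story": 12}
child_b = {"leiter": 9, "naming_vocabulary": 55,
           "action_picture_grammar": 6, "bus_story": 5}

for label, child in (("child_a", child_a), ("child_b", child_b)):
    print(label, round(discriminant_score(child), 2), predicted_outcome(child))
```

The same two functions would apply to the other columns of Table 4 by swapping in the corresponding weightings and cutoff.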
The discriminant function analysis was then repeated for the whole sample, including
children with general delay. Results are shown in Table 4. Outcome was correctly
predicted for 78 of 87 children (90%). In five cases a poor outcome was obtained for a
child for whom good outcome was predicted, but, as with the previous analysis,
these misclassifications were not serious insofar as all these children only narrowly
missed being included in
.605
.504
TABLE 4. Discriminant function analysis: Predicting outcome from 4-year scores.
                                            Weighting
                        SLI only:       Whole sample:   Whole sample:         Whole sample:
Variable                all variables   all variables   best four variables   best single variable
Leiter                     8.9974          5.1613          3.8459                -
MLU                       -2.7036          2.0046          -                     -
Naming Vocab               1.5041          1.3610          0.8707                -
Verbal Comp               -0.6566         -0.5881          -                     -
Action P Information      -2.7923         -2.1301          -                     -
Action P Grammar           3.7952          2.1319          1.3692                -
Bus Story                  3.8166          2.7726          1.8204                1
BPVS centile               0.1783          0.0384          -                     -
% consonants corr.         0.1453         -0.1063          -                     -
Cutoff: z0               160.52           97.12          117.35                  9.55
Generalized distance       2.11            2.10            1.96                  1.58
Correctly classified      88%             90%             86%                   83%

Score
Outcome Good Poor
TABLE 5. Correlations between tests at 4 years (Ns in parentheses).
Leit MLU NVoc VCom API APG Bus BPVS %con %un
MLU .312" . . . . . . . . . (87)
NVoc .312" .510" . . . . . . . . (86) (85)
VCom .315" .387* .706* (87) (86) (85) API .460* .669* .627* (87) (87) (85)
APG .284* .853* .596* (87) (87) (85) Bus .447* .727* .662* (84) (84) (82)
BPVS .502* .537* .539* (76) (76) (76)
%con .127 .437* .143 (88) (87) (86) %un -.230 -.426* -.513' (85) (85) (83)
utt -.088 -.098 -.201 (85) (85) (83)
.......
.437* (86) .677* (84) .690* (76)
- . 0 3 1 (87)
-.403* (85)
...
. ....
.717" .... (84) (84)
.515" . . . (86)
.775* (87) .733*
.556* (76)
.198
(87) (87) (84)
-.499* -.480* -.486* (84) (85) (83)
-.319" -.136 -.148 -.218
(85)
(84) (85) (83)
Note. Key: Leit = Leiter; NVoc = Naming Vocabulary; VCom = Verbal Comprehension;
API = Action Picture Information; APG = Action Picture Grammar; Bus = Bus Story; BPVS
= BPVS centile; %con = % consonants correct; %un = % unintelligible utterances; utt =
total number of utterances.
*Correlation significant at 1% level.
order, despite the presence of the pictures, which should have aided ordered recall.
Relationship of Outcome to Therapy
Given the massive proportion of the variance in outcome that is accounted for by
initial language status, it is perhaps not surprising that we found no improvement in
prediction when rating of amount of therapy received between first and last session
was included in the analysis. One cannot conclude from this that therapy has no
effect on outcome, because therapy was not assigned randomly and was more likely to
be given to children with more severe language impairment. For children with good
outcome (and relatively good initial status), during the 18-month period of study 31%
had no therapy, 19% had less than 11 hr of therapy, 31% had more than 11
hr of therapy, and 19% had spent at least 2 months in a language unit with specialized
teaching and intensive therapy. For children with poor outcome, 13% had no therapy,
15% had less than 11 hr of therapy, 26% had more than 11 hr of therapy, and 47% had
spent at least 2 months in a language unit. It could be that the most severely impaired
children would have done even more poorly had they not received any therapy. All we
can say about the effects of therapy from this study is that in a sample such as this
where there is a wide range in severity of disorder, any positive effects of therapy on
outcome are swamped by differences due to initial status.
TABLE 6. Distribution of Bus Story scores at 4 years (whole sample).
Observed Patterns of Disorder at 4 Years and Their Relationship to Outcome
In order to classify children by pattern of disorder, language measures were grouped
into four broad categories: phonology (expressive), syntax and morphology
(expressive), semantics (expressive), and language comprehension (see under
Procedure, above). Performance in each of these areas was rated as impaired (i.e.,
performance on any test in that area in impaired range according to Table 1 criteria)
or unimpaired. Four children whose scores did not fall into the impaired range on any
test were excluded. By rating performance in each of these areas as impaired or
unimpaired one obtains 15 possible patterns of language disability, but 3 of these
patterns were never observed, and 6 patterns accounted for the data from 75 of 83
children (90%). The 4 most common patterns, which together account for 71% of
children, are
            Outcome
Score     Good   Poor
0-3         1     12
4-6         1     24
7-9         4      7
10-12      13      5
13-15       4      1
16-18       4      2
19+         5      0
.518" .674* - - - (76) (74)
.365* .157
.054 - - (76)
-.331' -.460* - (75) (85)
-.283 .008 .250 (75) (85) (85)

those predicted by the model shown in Figure 1. Pattern of disability relative to
outcome is shown in Table 7.
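The Roman-numeral pattern labels appear to follow a binary coding of the four areas (phonology = 1, syntax and morphology = 2, semantics = 4, comprehension = 8), which is consistent with Patterns I, III, VII, and XV as described in the text and with the XI and XII rows of Table 7. That coding is an inference rather than something the text states, and the names `AREAS`, `ROMAN`, and `pattern_label` below are introduced purely for illustration. Under that assumption, the 15 possible patterns can be enumerated as follows.

```python
from itertools import combinations

# Assumed bit order, lowest bit first (an inference from the pattern labels).
AREAS = ("PHON", "SYN", "SEM", "COMP")
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII",
         "IX", "X", "XI", "XII", "XIII", "XIV", "XV"]


def pattern_label(impaired: set) -> str:
    """Map a set of impaired areas to a pattern label under the assumed coding."""
    code = sum(1 << i for i, area in enumerate(AREAS) if area in impaired)
    return "Unimpaired" if code == 0 else ROMAN[code - 1]


# Every non-empty combination of the four areas gives one possible pattern.
all_patterns = {
    pattern_label(set(combo)): combo
    for size in range(1, len(AREAS) + 1)
    for combo in combinations(AREAS, size)
}

print(len(all_patterns))  # number of possible patterns of disability
print(pattern_label({"PHON", "SYN", "SEM"}))
```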
To see how pattern of disability related to outcome, data from Table 7 were collapsed
into a 5 x 2 table with two outcome groups, good and poor (specific language
impairment and general delay combined), and five patterns of disability:
Pattern I: Pure phonological impairment; 78% of these children had good outcome.
Pattern III: Impairment restricted to phonology and syntax and morphology; 56% of
this group had good outcome.
Pattern VII: All expressive functions impaired; 13% of this group had good outcome.
Pattern XV: Receptive-expressive disorder with impairment in all four areas of
language function; 14% of this group had good outcome.
Other: All other patterns of impairment (see Table 7); 33% had good outcome.
Frequencies of outcome types were significantly different for these language patterns
(χ²(4) = 14.38; p < .01). Although frequencies were too small for specific
comparisons, it is clear that the data are inconsistent with the hypothesis that a selective
impairment of particular language functions has a poor prognosis. On the contrary,
the poorest outcomes were found in children whose language was impaired on a wide
range of measures, whereas the best outcome occurred in children with an isolated
phonological impairment.
TABLE 7. Frequency of patterns of language impairment relative to outcome.
Specific Lang Impairment
Outcome
Pattern PHON SYN SEM COMP Good Poor Good Poor
Ix 79.00
II x 3 4 0 1 III x x 9 7 0 0 IV x 1100
VI
VII x
X
XI x x x 1 0 0 0 XII xx0001
TABLE 8. Patterns of impairment at 4 and 4½ years.
                          4-year pattern
4½-year pattern     I    III   VII    XV   Other
Unimpaired          5     4     1     1     6
I                   4     4     2     0     0
III                 0     7     6     0     0
VII                 0     1     8     4     1
XV                  0     0     1     6     1
Other               0     0     5     3    13
In a further analysis, we examined the stability of patterns of language impairment
over time. Data from the second test session, when children were aged 4½ years,
were categorized into language patterns as those from the first session had been,
using the criteria for impairment shown in Table 1. Table 8 shows the relationship
between patterns of impairment observed in the two sessions. Because of small
expected frequencies, it was necessary to combine Categories I and III, and VII and XV
before testing the significance of the association between pattern of impairment on
the two occasions. The association was highly statistically significant (χ²(6) = 56.28; p <
.001). For children with Initial Patterns I, III, VII, and XV, 25 of 62 children (40%)
maintained the same pattern of impairment across the two sessions, 19 (31%) moved
to the next pattern up the series, 8 (13%) moved two or more positions up the series,
and the remaining 10 children (16%) moved to the "other" category or moved to a
pattern lower down the series. The majority of the 21 children who were initially in the
"other" category either remained in this category (62%) or improved to the
"unimpaired" category (29%).
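The transition counts in this paragraph can be re-derived from the Table 8 cell frequencies. The sketch below uses the cell counts as read from Table 8 (rows are the 4½-year pattern, columns the 4-year pattern), so it should be treated as a reconstruction; the category names and the `classify` helper are introduced here for illustration.

```python
from collections import Counter

# Columns: pattern at 4 years. Rows: pattern at 4.5 years.
# Cell counts as read from Table 8 (a reconstruction of the printed table).
COLS = ["I", "III", "VII", "XV", "Other"]
TABLE8 = {
    "Unimpaired": [5, 4, 1, 1, 6],
    "I":          [4, 4, 2, 0, 0],
    "III":        [0, 7, 6, 0, 0],
    "VII":        [0, 1, 8, 4, 1],
    "XV":         [0, 0, 1, 6, 1],
    "Other":      [0, 0, 5, 3, 13],
}
SERIES = ["Unimpaired", "I", "III", "VII", "XV"]  # least to most impaired


def classify(initial: str, later: str) -> str:
    """Describe the move from the initial pattern to the later pattern."""
    if later == "Other":
        return "other_or_lower"
    steps_up = SERIES.index(initial) - SERIES.index(later)
    if steps_up == 0:
        return "same"
    if steps_up == 1:
        return "up_one"
    if steps_up >= 2:
        return "up_two_or_more"
    return "other_or_lower"


tally = Counter()
for later, row in TABLE8.items():
    for initial, count in zip(COLS[:4], row[:4]):  # Initial Patterns I-XV only
        tally[classify(initial, later)] += count

total = sum(tally.values())
for name in ("same", "up_one", "up_two_or_more", "other_or_lower"):
    print(f"{name}: {tally[name]} of {total} ({100 * tally[name] / total:.0f}%)")

# Children initially in the "other" category.
other_col = [row[4] for row in TABLE8.values()]
print("other -> other:", TABLE8["Other"][4], "of", sum(other_col))
print("other -> unimpaired:", TABLE8["Unimpaired"][4], "of", sum(other_col))
```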
When changes in pattern occur, are they random or systematic? Hypothesis C
maintains that Patterns I, III, VII, and XV correspond to different levels of severity in a
single underlying disorder and so predicts that as children improve they will tend to
move to the next category in the series. This leads to four specific predictions of
associations between initial and final pattern (i.e., Initial Pattern I will be associated
with final unimpairment, Initial Pattern III will be associated with Final Pattern I, Initial
Pattern VII will be associated with Final Pattern III, and Initial Pattern XV will be
associated with Final Pattern VII). Each prediction was tested by recasting the data
from all children who changed category in a two-way table and doing a Fisher exact
test. Thus, for example, in the first test, the proportion of children in Initial Group I
who were finally unimpaired (5 of 5) was compared to the proportion of all other
children who were finally unimpaired (12 of 40) after excluding all children who did
not change category. All four predicted associations were significant at the .01 level.
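The first of these comparisons (5 of 5 versus 12 of 40) can be worked through as a Fisher exact test on a 2 x 2 table. The sketch below computes a one-sided p value from the hypergeometric distribution; whether the original analysis was one-sided or two-sided is not stated, so that choice is an assumption, and `fisher_exact_greater` is a name introduced here for illustration.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p value for the table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of a top-left cell
    at least as large as the observed value a.
    """
    row1 = a + b          # children in the initial group of interest
    col1 = a + c          # children with the predicted final pattern
    n = a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, row1)


# Initial Group I: 5 finally unimpaired, 0 not.
# All other children who changed category: 12 finally unimpaired, 28 not.
p_value = fisher_exact_greater(5, 0, 12, 28)
print(f"one-sided p = {p_value:.4f}")
```

The resulting p value falls below .01, in line with the significance level reported for this comparison.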
A second prediction from this model is that Groups I, III, VII, and XV should follow a
regular order in terms of severity of impairment in a particular area. For example, by
definition we know that syntax and morphology are impaired in Groups III, VII, and XV,
but the model predicts that the severity of impairment should increase as one goes
down the series. The relevant data are shown in Table 9.
The scores decrease as expected going from Pattern I to Pattern XV for Naming
Vocabulary, Verbal Comprehension, Action Picture Information, Bus Story, and BPVS.
On the
XIII x XIV
XV x
xx 0101 x x 1 15 2 5
0
xx0100 xxx1103 x x x 2 6 0 6
xx100
Note. x = impaired; . = unimpaired. PHON: percentage consonants correct; SYN: MLU
and/or Action Picture Grammar; SEM: Naming Vocabulary, Action Picture Information,
or Bus Story; COMP: BAS Verbal Comprehension or BPVS.
General Delay
Outcome
41/2-yearpattern
I III
4-year pattern VII XV
Other

TABLE 9. Mean scores (and SD) at 4 years relative to pattern of impairment.
                   Group I         Group III       Group VII       Group XV
Test               n = 9           n = 17          n = 23          n = 14
MLU                 4.9 (0.53)      3.3 (0.62)      2.1 (0.73)      2.3 (0.97)
Naming Vocab       71.6 (12.26)    66.7 (8.49)     56.2 (9.50)     48.4 (7.04)
Verbal Comp        87.9 (8.19)     85.7 (7.37)     80.9 (8.48)     63.1 (7.09)
Action P Info      24.3 (3.77)     18.7 (2.34)     10.9 (3.71)      8.6 (3.27)
Action P Gram      18.9 (4.43)      9.1 (4.17)      3.7 (3.15)      3.9 (3.41)
Bus Story          16.4 (5.48)     10.5 (2.07)      5.7 (2.70)      3.3 (1.92)
BPVS centile       36.9 (15.64)    36.5 (19.82)    16.6 (12.11)     5.2 (3.85)
% cons correct     59.4 (14.12)    48.1 (15.52)    42.6 (17.96)    52.1 (12.13)
Note. Groups as defined in Table 7: I--pure phonological impairment, III--phonologic-
syntactic impairment, VII--general expressive impairment, XV--expressive and
receptive impairment.
two comprehension tests, scores for Groups I and III are very close, but this could
reflect a ceiling effect because these scores are also similar to control levels (see Table
2). On MLU and Action Picture Grammar it was predicted that the scores for Group VII
would be higher than those of Group XV, and this was not found. However, both
groups obtained scores close to the minimum possible level. If one excludes possible
floor and ceiling effects, data on measures of syntax and morphology, semantics, and
comprehension do appear to agree with the model in Figure 1. The one measure that
does not give results in accord with the model is percentage consonants correct,
where all four groups perform at a comparable low level.
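The ordering prediction can be checked mechanically against the group means. The sketch below uses the means as read from Table 9 and simply asks, for each measure, whether the scores fall strictly from Group I through Group XV; it ignores standard deviations and the floor and ceiling effects discussed above, so it illustrates the comparison rather than testing it statistically.

```python
# Group means at 4 years as read from Table 9, in the order I, III, VII, XV.
MEANS = {
    "MLU":            (4.9, 3.3, 2.1, 2.3),
    "Naming Vocab":   (71.6, 66.7, 56.2, 48.4),
    "Verbal Comp":    (87.9, 85.7, 80.9, 63.1),
    "Action P Info":  (24.3, 18.7, 10.9, 8.6),
    "Action P Gram":  (18.9, 9.1, 3.7, 3.9),
    "Bus Story":      (16.4, 10.5, 5.7, 3.3),
    "BPVS centile":   (36.9, 36.5, 16.6, 5.2),
    "% cons correct": (59.4, 48.1, 42.6, 52.1),
}


def strictly_decreasing(values) -> bool:
    """True when each value is lower than the one before it."""
    return all(earlier > later for earlier, later in zip(values, values[1:]))


ordered = [name for name, means in MEANS.items() if strictly_decreasing(means)]
not_ordered = [name for name in MEANS if name not in ordered]

print("decrease from Group I to Group XV:", ordered)
print("do not decrease throughout:", not_ordered)
```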
DISCUSSION
Characteristics of Language-Impaired Sample at Initial and Final Assessment
The first question we considered was what range of problems are found among
children referred for professional help because of concern about language
development at 4 years of age. Although we had asked referring agencies to exclude
children of low nonverbal ability, 22% of our sample scored more than two standard
deviations below the mean on our nonverbal test. Some of these children had been
assessed using an instrument other than the Leiter and were regarded as having
intelligence within normal limits, whereas others had not had any previous intellectual
assessment. For most of our analyses we kept this subgroup of children separate from
the remainder of the sample, but in practice we found no evidence of qualitative
differences between these and other children. Quantitative differences were
pronounced, however, and we found that very few of these children were rated as having
good outcome at final follow-up. When slow language development occurs in the
context of a low nonverbal test score, this can be regarded as a poor prognostic sign.
The converse does not follow, however, in that many children with good nonverbal
scores had poor language outcome.
For children with nonverbal ability in the normal range, good outcome was more
common, being found in 30 of 68 children (44%). As this is a relatively high
proportion compared to some other studies, one might wonder whether many children
had trivial problems to start with. However, if we adopt the same criterion for
classifying children at 4 years (i.e., "good" status defined as having no language score
in the impaired range and no more than one score below the satisfactory range), then
only 2 children are classified as unimpaired at first assessment. Thus, using comparable
criteria at 4 years and 5½ years, we find that many children change from poor to good.
Furthermore, over two thirds of children in the good outcome group had been regarded
as in need of speech therapy during the course of the study. However, the severity of
impairments in our sample was undoubtedly less extreme than in some other studies.
The vast majority of children in this study showed no autistic features (cf. Allen &
Rapin, 1980; Paul, Cohen, & Caparulo, 1983). An informal note had been made of any
behavioral peculiarities during assessment. All children made eye contact on first
assessment, and in only three cases did the examiner feel she had failed to overcome
initial shyness and establish reasonable rapport. Three children showed occasional
echolalia. The only behavioral features noted with any frequency in this sample were
short attention span and distractibility. Furthermore, no child was totally mute on first
assessment, although 3 children produced little more than unintelligible single words
and symbolic noise. The analysis of data from Table 2 confirms that in general children's
low scores on expressive tests did not result from muteness or reticence. Indeed, the
only significant difference between a language-impaired group and the control group
in terms of number of utterances is in favor of the language-impaired children,
perhaps because they attempt to convey one basic message in several short
utterances rather than in one complex sentence.
A second question one might ask is whether good outcome is aptly defined. It could
be that despite fulfilling the criteria used here, children in the good outcome group do
still have significant problems relative to their peer group. However, analysis of data in
Table 2 discounts this possibility. Just by inspection one can see that the good
outcome group are scoring well within the normal range.

Relationship of Outcome to Severity of Disorder at 4 Years
The discriminant function analysis confirmed that outcome status is strongly related
to level of performance at 4 years, but strength of prediction depends on which test
we consider. Severity of phonological impairment at 4 years did not differentiate
children with good and poor outcome, whereas other language measures, especially
those measuring expressive semantic ability, are sufficiently strongly related to
outcome to be useful in giving a prognosis for an individual child.
Observed Patterns of Disorder at 4 Years and Their Relationship to Outcome
We described three hypotheses that make predictions about expected patterns of
language disorder and their relationship to outcome.
We found no support for Hypothesis A, which predicted two distinct subgroups of
impairment: one consisting of an even pattern of delay in all language functions with
a good prognosis and the other consisting of isolated impairments in specific functions
with a poor prognosis. Our data indicated that the relationship between outcome and
pattern of impairment was the opposite: Isolated impairments were more likely to
have good outcome than more pervasive problems.
Hypothesis B maintains that there are several diverse types of language impairment,
each with its own prognosis, and that the pattern of impairment will be reasonably
stable from one test session to the next. Although many children did maintain the
same pattern of impairment across sessions, there were many instances of category
change, and, more importantly, these were systematic and not random, suggesting
that different patterns of impairment did form a related continuum and were not
distinct conditions.
Hypothesis C makes several distinct predictions with which our data were largely
consistent. This hypothesis maintains that there is a hierarchy of vulnerability of
language functions. Our data offered support for this view, in that the most frequently
observed patterns of impairment formed a graded series with increasing numbers of
impaired functions being added as one descended the series. The order of vulnerability
of language functions agreed with that predicted from the specific model shown in
Figure 1. Furthermore, there was a clear relationship between severity of language
impairment and number of functions impaired. This was reflected both in the data on
outcome, which indicated the best outcome for those with fewest functions impaired,
and, to a more limited extent, in the data on severity of impairment within any
language function: With the exception of phonology, the severity of impairment in any
one area was related to the number of areas that were impaired. Finally, patterns of
language impairment changed as children improved, and these changes were
systematic and largely consistent with the model shown in Figure 1. Nevertheless, we
did observe patterns of impairment that were inconsistent with the model, and it
should not be concluded that all cases of specific language impairment can be
accounted for by a model of a unitary disorder.
General Discussion and Conclusions
Our data support the idea that for many preschool language-impaired children the
term language delay is an apt description, in that after a slow start these children do
make progress and eventually catch up with their peers.
Other longitudinal studies have reported variable rates of outcome, ranging from a low
of 7% unimpaired at follow-up (Stark et al., 1984) to a high of over 50% with normal
classroom placement and reading achievement at follow-up (Aram & Nation, 1980).
Such discrepancies probably result from methodological differences between studies.
First of all, the proportion of children with good outcome will depend on whether or
not the study includes children of low nonverbal ability or with additional
handicaps. Our study, like that of Stark and Tallal (1981), found that many children
who are referred as cases of specific language impairment do have more general
intellectual problems. In common with Aram et al. (1984), we found a strong
relationship between scores on the Leiter test and outcome: If we included only those
children with Leiter scores in the normal range, then 44% of our sample had a good
outcome, whereas for the subset of children with Leiter scores more than two
standard deviations below the mean, only 11% had a good outcome. Studies that
select cases on the sole criterion of poor language development (e.g., Sheridan, 1973;
Silva, 1980; Silva, McGee, & Williams, 1983) are likely to include many children with
mild intellectual retardation. Fundudis et al. (1979) found that a group of children
selected purely on the basis of poor expressive language development at 3 years
contained a high proportion of cases who were also delayed in learning to walk. On
follow-up at 7-8 years, their sample was significantly impaired on a range of language
and educational tests, but outcome was noticeably poorer for the "late walkers," who
were also impaired on nonverbal tests, than for the "early walkers," who had more
selective verbal deficits. Likewise, Richman, Stevenson, and Graham (1982) found that
although poor scores on language tests at 3 years of age are highly predictive of later
educational problems, this is largely because the majority of low scorers are generally
retarded in all aspects of development. In contrast to other studies, they did not find a
raised frequency of specific reading difficulties in 8-year-old children with a history of
specific language delay at 3 years.
A second factor that will affect results is the age distribution of the sample. Had we
selected an initial sample of language-impaired children at 5 years of age, rather than
at 4 years of age, we would have excluded many of the good outcome children from
the sample because they would no longer have been regarded as language impaired
by that age. Given that there is a strong relationship between severity of initial
disorder and outcome, the older children are when selected for study, the more severe
their disorder will be. Thus, we can expect the proportion of children with a good
outcome to be much lower in studies using older samples. This is consistent with
comments by Stark et al. (1984) who followed up 29 language-impaired children aged
4-8 years. Four years after initial contact, 6 of these children (21%) no longer fulfilled
the criteria for language impairment, and it was noted that these tended to be those
children aged 6½ years or less at time of initial contact. Similarly, the few children
studied by Aram et al. (1984) who were in regular classes at follow-up in adolescence
tended to be those who had been identified as language-impaired below the age of
4½ years. The study by Aram and Nation (1980) that reported unusually good
outcome in language-impaired children investigated a group with mean age of 32
months at diagnosis.
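The selection effect described in this paragraph can be illustrated with a small deterministic sketch. Every number in it is hypothetical and is not taken from this study or from any study cited here: it assumes that a fixed number of children have a transient impairment that resolves at some age between 4 and 6 years, that the remainder have a persistent impairment, and that follow-up takes place 18 months after selection. The function name and its parameters are illustrative only.

```python
import math


def good_outcome_rate(selection_age, follow_up_interval=1.5,
                      n_transient=400, n_persistent=600,
                      recovery_window=(4.0, 6.0)):
    """Proportion of children still impaired at selection_age who have
    recovered by follow-up, under purely hypothetical assumptions."""
    start, end = recovery_window
    # Transient cases: recovery ages spread evenly across the window.
    recovery_ages = [start + (end - start) * (i + 0.5) / n_transient
                     for i in range(n_transient)]
    # Persistent cases: never recover within the period considered.
    recovery_ages += [math.inf] * n_persistent

    # Only children who are still impaired at selection enter the sample.
    selected = [age for age in recovery_ages if age > selection_age]
    follow_up_age = selection_age + follow_up_interval
    recovered = [age for age in selected if age <= follow_up_age]
    return len(recovered) / len(selected)


if __name__ == "__main__":
    for age in (4.0, 5.0):
        print(f"selected at {age} years: "
              f"{good_outcome_rate(age):.2f} good outcome at follow-up")
```

Under these assumed values the sample selected at 5 years shows a lower proportion of good outcomes than the sample selected at 4 years, simply because many transient cases have already resolved before selection and so never enter the older sample.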
Quite apart from these factors, outcome will depend on the severity of initial language
impairment. Studies that concentrate on children in special education or those
referred for neurological investigation (e.g., Allen & Rapin, 1980; Griffiths, 1969; Paul
et al., 1983) will include a relatively high proportion of severe impairments and so tend
to find a poorer outcome on average than we did. Although some of the children in
our study were seen as requiring intensive, specialized teaching, we also included
many cases who had been identified as having poor language development by a
speech-language pathologist but who were either treated on a community basis or
who were placed "on review" without regular therapy.
Our study followed up children only to the age of 5½ years. We have drawn attention
to the fact that a high proportion of these children were indistinguishable from control
children on verbal measures by this age and so appeared to have recovered
completely from early language delay. However, it may be that these children remain
at risk for other problems. The language skills in which "recovery" is most pronounced
are those that are normally mastered relatively young; thus, test performance on
measures of syntax and phonology starts to reach an asymptote after 4 or 5 years in
normal children. One could argue that because the normal child is not moving ahead
further in these skills, the language-impaired child has a chance to catch up. Perhaps,
however, if we used more sensitive measures of phonological or syntactic skills (e.g., if
we looked at speed of articulation of complex words), then deficits would still be
apparent, even in the good outcome group. Also, we may find that in other areas of
verbal skill where development is usually rapid after 5 years, impairment again
becomes apparent in this group. A reliable finding from other longitudinal studies is
that children who are identified as having language problems frequently go on to have
difficulties learning to read and write (Aram & Nation, 1980; Fundudis et al., 1979;
Mason, 1967; Stark et al., 1984). Perhaps even those children in our sample who are
indistinguishable from controls at final assessment are at risk for later educational
difficulties. Stark et al. (1984) reported that of the six
children in their study who appeared to have recovered from early language
impairment, only two were reading appropriately for their age. However, likelihood of
reading problems in a language-impaired child does appear to depend on the type of
impairment, so that children with isolated phonological disorders are far less likely to
have reading problems than those with semantic and syntactic difficulties (Debray-Ritzen,
Mattlinger, & Chapuis, 1976; Levi, Capozzi, Fabrizi, & Sechi, 1982). We aim to
study our sample further to discover if reading problems develop even in children
who have apparently recovered from early language delay by the time they start
school or in those who have only residual phonological problems.
The second question that concerned us was whether one can predict outcome for
individual children on the basis of information about language status in the preschool
period. Aram et al. (1984) went some way toward answering this question, reporting
significant correlations between some language and nonverbal measures obtained in
the preschool period and subsequent outcome. However, their sample size was fairly
small (20 children) and heterogeneous with respect to both age (ranging from 3½ to
just below 7 years) and nonverbal ability, and although correlations between initial
status and outcome are highly significant for some measures, one could not use these
data as a basis for confident predictions about outcome in individual cases.
Our study, using a large sample with a narrow age range, showed that one can predict
with a high degree of accuracy the likelihood of good outcome for individual children.
The pessimism of Stark, Mellits, and Tallal (1983), who suggested that standardized
tests may be poor at predicting outcome for young language-impaired children, does
not seem to be warranted. We found that on the basis of a 1-hr language assessment
given at the age of 4 years, outcome could be correctly predicted in 90% of children,
and the few errors in prediction were not serious, insofar as they involved children
who fell close to the borderline dividing good and poor outcome. The analyses shown
in Table 4 could be applied in clinical settings to give a prognosis to a language-
disordered child in this age range. It is unfortunate that cultural factors limit the
practical applicability of our result; the test that gives the best prediction of outcome,
the Bus Story Test, would need to be modified to make it suitable for non-British
children. Nevertheless, our study does allow some general conclusions that may be
applied in practice even in situations where the particular assessments used here are
not feasible. First, severity of phonological impairment is not in itself a good
prognostic index. Second, the likelihood of persisting impairment is directly related to
the range of language functions that are impaired. A 4-year-old who is impaired only in
phonology is likely to be problem-free by 5½ years, whereas a child whose language is
limited in content as well as structure has a poorer outlook, especially if
comprehension is also impaired. Third, a child who at 4 years of age is unable to give
even a simplified account of a sequence of events in a story accompanied by pictures is
likely to have a poor outcome. Conversely, ability to relate the main events
from a story in the correct sequence is a good prognostic sign in a 4-year-old, even if
the syntax, morphology, and phonology used by the child are very immature.
At first sight, there seems to be considerable conflict between our findings and those
from Schery's (1985) study of 718 children, which reported very little success in
predicting future progress from language or indeed other types of measures. The
conflict is, however, more apparent than real. Schery, in fact, found highly significant
correlations between initial language status and outcome measures, just as we did.
The problem for her study was that, given the age range of her sample, the
meaningfulness of this finding was questionable because scores were apparently not
age adjusted. In a sample ranging in age from 3 to 16 years, any age-dependent
variable measured at two points in time (e.g., height, shoe size) will give a high
correlation. Presumably for this reason, Schery aimed to predict, not final status, but
amount of improvement in scores from one occasion to the next. Elsewhere (Bishop &
Edmundson, in press) we conducted a similar analysis on our own data and found
results very similar to those of Schery (i.e., amount of change over a given interval was
not related to language variables nor to presumptive etiological factors). In contrast
to Schery, we used a narrow age band of children, so correlations between initial
status and outcome are meaningful and can be used in prognosis. However, a disadvantage
is that our conclusions have correspondingly narrow applicability (e.g., one
could not generalize them to 3-year-olds where inability to tell a story might have very
different significance).
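The point about age-dependent variables can be illustrated with a minimal simulation. It is a sketch under stated assumptions, not an analysis of Schery's data or of ours: the raw score is assumed to rise linearly with age, each occasion adds independent noise, and no stable individual differences are built in. The function names and all parameter values (gain per year, noise level, sample size, seed) are hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def retest_correlation(min_age, max_age, n=2000, interval=1.5,
                       gain_per_year=10.0, noise_sd=5.0, seed=0):
    """Correlation between raw scores on two occasions when the score
    depends only on age (hypothetical model, not real data)."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n):
        age = rng.uniform(min_age, max_age)
        # Score is driven by age plus independent noise on each occasion;
        # nothing specific to the individual child carries over.
        first.append(gain_per_year * age + rng.gauss(0.0, noise_sd))
        second.append(gain_per_year * (age + interval)
                      + rng.gauss(0.0, noise_sd))
    return pearson_r(first, second)


if __name__ == "__main__":
    print("ages 3 to 16 years:", round(retest_correlation(3.0, 16.0), 2))
    print("ages 4 to 4.5 years:", round(retest_correlation(4.0, 4.5), 2))
```

With these assumed values the correlation in the 3-to-16-year sample is very high even though the simulated children differ only in age, whereas in the narrow age band it falls to a small value. A high correlation between unadjusted scores in a wide age range therefore says little about prediction for individual children.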
We had predicted that a selective deficit in one area of language functioning might
have a worse prognosis than a more even pattern of impairment, but the converse was
found. Furthermore, highly accurate prediction of outcome could be obtained by
combining data from children with normal nonverbal scores and from those who were
impaired on nonverbal measures also, suggesting that determinants of language
outcome were similar in the two subsets of children.
The only area of language functioning where outcome did not relate strongly to
severity of impairment was phonology. It may be that the nature of phonological
processes used by the child is a better predictor of outcome than the overall level of
functioning. The child who uses processes that are common in normal development
may have a better prognosis than one who uses atypical, "deviant" processes. We
intend to carry out a detailed analysis of phonological data to analyze this question.
In this respect also our results are in apparent conflict with another study. Aram et al.
(1984) found that a measure of expressive phonology was a good predictor of later
language and reading outcome. Here again, the explanation may relate to differences
in age constitution of samples. Because most children have near perfect mastery of
phonology by 4 years, phonological status is typically assessed in terms of a raw
measure that is not age adjusted because it is not feasible to use standardized scores
as with other language skills. Nevertheless, a child
with a given low score at 5 or 6 years has a far more serious problem than one with the
same score at 4 years, insofar as statistically this is much more unusual. Aram et al.
included children aged up to 6 years 11 months in their study, and it may be that
phonological competence does start to have predictive significance in this age range.
We have stressed the importance of age in studies of language-impaired children. Our
own experience is that the pattern and severity of impairment can change dramatically
in language-impaired children, even over a relatively brief interval such as
the 18-month period used here. Longitudinal studies and those concerned with
classification or treatment effects often use samples with a wide age range. Although
we appreciate the difficulty of finding adequate numbers of subjects of similar age,
we feel that such a procedure introduces variance that is likely to be large
enough to mask other effects of interest. We would suggest that by restricting
consideration to a narrow age band one may find far more regularity and order in data
from language-impaired samples than might have seemed possible.
ACKNOWLEDGMENTS
This work would not have been possible without the generous support of Dr. Errington
Ellis and the staff of the Child Development Centre, Royal Victoria Infirmary,
Newcastle-upon-Tyne. We would also like to acknowledge all the help given to us by
speech-language pathologists in Sunderland, South Tyneside, North Tyneside,
Northumberland, Durham, Newcastle, Salford, Trafford, Central and North
Manchester, and Preston and by Dr. Ian McKinlay and the staff of the Holzel Centre,
Booth Hall Hospital, Manchester. Thanks are also due to staff and parents at North
Shields Nursery, Monkchester Road Nursery, Sir James Knott Nursery, Sandyford
Nursery, Raby St Infants School, Meadow Well Primary School, Hedworthfield Primary
School, Welbeck Road Infants School, and Hillsview School. This study was supported
by a project grant from the British Medical Research Council.
