
FOREIGN LANGUAGE RESEARCH IN
CROSS-CULTURAL PERSPECTIVE
STUDIES IN BILINGUALISM (SiBil)
EDITORS
KEES DE BOT (University of Nijmegen)
THOM HUEBNER (San José State University)
EDITORIAL BOARD
Michael Clyne (Monash University)
Theo van Els (University of Nijmegen)
Charles Ferguson (Stanford University)
Joshua Fishman (Yeshiva University)
François Grosjean (Université de Neuchâtel)
Wolfgang Klein (Max Planck Institut für Psycholinguistik)
Christina Bratt Paulston (University of Pittsburgh)
Suzanne Romaine (Merton College, Oxford)
Charlene Sato (University of Hawaii at Manoa)
Merrill Swain (Ontario Institute on Research in Education)
Richard Tucker (Center for Applied Linguistics, Washington)
Volume 2
Kees de Bot, Ralph B. Ginsberg and Claire Kramsch (eds)
Foreign Language Research in Cross-Cultural Perspective

FOREIGN
LANGUAGE RESEARCH IN
CROSS-CULTURAL PERSPECTIVE
edited by
KEES DE BOT
University of Nijmegen
RALPH B. GINSBERG
University of Pennsylvania
&
National Foreign Language Center
CLAIRE KRAMSCH
University of California, Berkeley
JOHN BENJAMINS PUBLISHING COMPANY

AMSTERDAM/PHILADELPHIA
1991
The publication of this volume has been supported by a subsidy from
the European Cultural Foundation.
The paper used in this publication meets the minimum requirements of
American National Standard for Information Sciences — Permanence of
Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data
Foreign language research in cross-cultural perspective / edited by Kees De Bot, Claire
Kramsch & Ralph B. Ginsberg.
p. cm. - (Studies in bilingualism ; v. 2)
Includes bibliographical references.
1. Language and languages - Study and teaching - Research. I. De Bot, Kees. II.
Kramsch, Claire J. III. Ginsberg, Ralph B. IV. Series.
P53.F598 1991
418'.007--dc20 91-6804
ISBN 90 272 4113 9 (Eur.) / 1-55619-541-9 (US) (pb.; alk. paper) CIP
© Copyright 1991 - John Benjamins B.V.
Reprinted in paperback: 1994
No part of this book may be reproduced in any form, by print, photoprint, microfilm,
or any other means, without written permission from the publisher.
John Benjamins Publishing Co. • Amsteldijk 44 • P.O.Box 75577 • 1070 AN Amsterdam
• The Netherlands
John Benjamins North America • 821 Bethlehem Pike • P.O. Box 27519 • Philadelphia,
PA 19118 • USA
Table of Contents
Foreword
Richard D. Lambert
Preface
Kees de Bot, Claire Kramsch & Ralph B. Ginsberg
SECTION I - PRIORITIES IN THE US AND IN EUROPE
Foreign Language Instruction and Second Language Acquisition Research in the United States
Charles A. Ferguson & Thom Huebner
Empirical Foreign Language Research in Europe
Theo van Els, Kees de Bot & Bert Weltens
SECTION II - MEASUREMENT AND RESEARCH DESIGN
Introduction to the Section Measurement and Research Design
Ralph B. Ginsberg
Focus on Form: A Design Feature in Language Teaching Methodology
Michael H. Long
Pros, Cons, and Limits to Quantitative Approaches in Foreign Language Acquisition Research
W.E. Lambert
Ask A Stupid Question...: Testing Language Proficiency in the Context of Research Studies
Christine Klein-Braley
Item Response Theory and Reduced Redundancy Techniques: Some Notes on Recent Developments in Language Testing
Mats Oscarson
SECTION III - TEACHING ENVIRONMENTS
Introduction to the Section on Teaching Environments
Kees de Bot
Research on Language Teaching Methodologies: A Review of the Past and an Agenda for the Future
Diane Larsen-Freeman
Problems in Defining Instructional Methodologies
Christopher Brumfit
Evaluation of Foreign Language Teaching Projects and Programmes
Rosamond Mitchell
The Characterization of Teaching and Learning Environments: Problems and Perspectives
Dick Allwright
SECTION IV - LEARNING ENVIRONMENTS
Introduction to the Section Learning Environments
Claire Kramsch
Some Ins and Outs of Foreign Language Classroom Research
Willis J. Edmondson
Linguistic Theory and Foreign Language Learning Environments
Suzanne Flynn
Culture in Language Learning: A View from the United States
Claire Kramsch
Implications of Intelligent Tutoring Systems for Research and Practice in Foreign Language Learning
Ralph B. Ginsberg
Foreword
Richard D. Lambert
The National Foreign Language Center was pleased to (co-)sponsor the
conference whose papers are herein reproduced. The theme of the conference
reflects two of the Center's primary goals: to encourage empirical research in
foreign language pedagogy and to link the language instructional community in
the United States more firmly with equivalent groups of scholars and teachers in
other countries. To this end, under the joint chairmanship of Prof. Theo van Els
of Nijmegen University and myself, leading scholars in Europe and the United
States who are concerned with empirical research on foreign language instruction
met to exchange views.
The agenda for discussion included such issues as: priorities for research in
language pedagogy, effective forms of measurement, appropriate analytic
strategies and designs of proof, and the ways in which research results can
improve language learning and teaching. In addition to these essentially
methodological issues, the conference attendees discussed three substantive
topics: the relationship between language and culture, the effectiveness of
various teaching methodologies, and differences in learning environments. The
conference itself was organized into the sessions presented hereafter. All of
the papers for the conference were distributed beforehand so that the
discussion, led by the designated rapporteur, could proceed immediately to a
more general level, and so that the sessions could be made more cumulative as
the days progressed.
A basic assumption of the conference was that the very different language
instructional systems on the two continents had given rise to quite separate and
diverse research traditions. As the discussions progressed there were several
differences in orientation which did seem to distinguish the American from the
European attendees. For instance, the Americans were more inclined to focus on
individual language learners, non-classroom learning environments, the role of
learner variation, and the importance of research on policy. The Europeans
were more inclined to take the classroom as given, treat teaching and learning
together, and give preference to research that would help the teacher perform
better. A number of the American scholars were particularly interested in the
languages of Asia and Africa while the Europeans were primarily concerned
with the teaching of English or the languages of Western Europe.
By and large, however, the differences of opinion that emerged in the
discussion did not follow nationality lines. Sharp disagreements there were,
but it was difficult to predict on which side of the Atlantic the contending
parties or individuals would fall. For instance, there was disagreement as to
whether theoretical significance was a necessary definer of research priorities
or whether evaluation of existing programs and the solution of concrete problems
should be paramount; whether the need for narrowly-defined, rigorously
controlled experiments should take precedence over semi-ethnographic observation
of natural phenomena; what the appropriate scale of studies is; whether the
validity of the measures of learning outcomes is still indeterminate or
appropriate measures are available once the goals of research are set; what the
contribution of theoretical linguistics to research on second language learning
is; how advanced technologies should best be used; whether culture should be
taught directly or left to emerge as a by-product of language learning. These
and many other issues are represented in the papers that follow.
We wish to thank the European Cultural Foundation for sharing in the support
for this conference. The Rockefeller Foundation's Conference Center at
Bellagio, Italy, on Lake Como, is an ideal place for just such transnational
dialogues. The informal discussions that eddied around the edges of the formal
presentations were especially helpful in facilitating discussions across national
boundaries. We are especially thankful to the Rockefeller Foundation for its
hospitality and support.
5 January 1989
National Foreign Language Center
Washington, D.C.
Preface
Kees de Bot, Claire Kramsch & Ralph B. Ginsberg
In recent years research on foreign language teaching and learning has
increased substantially on both sides of the Atlantic. At the same time
differences in perspectives on what should be investigated and what paradigm
should prevail have also grown, to the point that there is serious concern, both
in Europe and in the United States, that a split in the field is not
inconceivable. In an attempt to narrow this gap, Richard D. Lambert of the
National Foreign Language Center and Theo van Els of the Department of Applied
Linguistics of the University of Nijmegen, the Netherlands, took the initiative
to organize a small-scale conference on empirical research in foreign language
pedagogy, bringing together scholars from Europe and the US. The aim of the
conference was to unearth both commonalities and differences in viewpoints and
paradigms. The conference was sponsored by the NFLC, the Rockefeller Foundation,
and the European Cultural Foundation, and was held at the Rockefeller
Foundation's Bellagio Study and Conference Center in June 1988. The present
volume is the outcome of this conference.
The editors are indebted to the authors for their cooperation through several
revisions, to Albert Cox for technical support in producing the manuscript,
to the National Foreign Language Center for support in getting this volume to
press, and to Yola de Lusenet of Benjamins Publishers for her help and patience.
Section I—Priorities in the US and in Europe
Foreign Language Instruction and Second Language
Acquisition Research in the United States
Charles A. Ferguson & Thom Huebner
Foreign language (FL) instruction and the related research on second language
acquisition (SLA) in the United States can be understood only in the context
of the role of English, of American education, and of speech and language
research and educational research in the United States. Any part of an
educational system is, after all, both a result of historical processes and a
response to current needs and values.
1 The language situation in the United States
Five aspects of the language situation are relevant to an understanding of
FL teaching and learning in the United States: the dominance of English in
American life, the scarcity of FL instruction in U.S. public schools, the language
professions, FL instruction outside the public schools, and myths about language
held by Americans.
1.1 Dominance of English
The most salient part of the language situation in the United States is surely
the overall dominance of English. Not only is English by far the most common
mother tongue, it is also by far the language most often learned as a second
language and is overwhelmingly the language of participation in U.S. economic,
political, and social life. Moreover, Americans perceive their nation as even
more monolingual than it is. In 1975, for example, when the U.S. Bureau of the
Census conducted a special sample survey of non-English languages, almost 18
percent of the population aged 14 years or older claimed a mother tongue other
than English (seven out of ten of them native-born Americans), and one person
out of eight aged four or older lived in a household in which a language other
than English was spoken (Waggoner 1981). Although not the national or official
language of the United States by constitution, statute, or regulation, English is
the de facto national language, its status maintained by powerful social
pressures, and non-English-speaking immigrant groups have generally experienced
relatively rapid attrition of mother tongue competence and corresponding shift
to English (Fishman et al. 1966; Veltman 1983). In spite of this pattern of
linguistic assimilation, the visibility of large numbers of Hispanics and the
relatively recent influx of Asians have resulted in movements advocating some
kind of legal status for English, both at state and national levels. The outcome
of such movements is unclear, but the dominance of English is likely to persist
no matter what the outcome.
1.2 FL Instruction in Public Schools

A complementary aspect of this English dominance is the very low incidence
of FL instruction in the schools. Although education is basically a state,
not a federal, responsibility and the greater part of policy making is in the hands
of local school districts, the picture of language instruction in American schools
is surprisingly similar from one part of the country to another. About
five-eighths of secondary schools offer some FL instruction, but in 71 percent
of these less than half of the students are enrolled in FL courses (Oxford and
Rhodes 1988). The most common pattern is probably two years of instruction in
Spanish. This lack of commitment to FL instruction in public education is unique
among industrialized nations. According to many observers FL enrollments have
"bottomed out" and started a slow rise. But the fact remains that American
educators give small place to FL instruction and on the whole do not expect
students to acquire a working competence in the language they study; the brief
exposure to a foreign language serves more as an inoculation against further
study than as a foundation for achieving advanced levels. Ringing statements by
national commissions and several political leaders to the effect that American
competence in FLs is disgraceful and a danger for the national wellbeing have
not yet led to significant changes in the pattern.
The small place for FL instruction in public education is compatible with
the widespread American view that bilingualism is a handicap, a mark of
inadequate control of English, and a sign of membership in an unassimilated and
presumably otherwise disadvantaged minority group. The support for bilingual
education symbolized by court decisions, federal legislation, and state and local
programs has been won on the grounds of equality of opportunity and quicker
transition to English, not on grounds of conserving the nation's FL resources
(Campbell and Schnell 1987).
Not surprisingly, one of the bright spots in FL instruction and research in
the United States involves the teaching of English to speakers of other
languages (TESOL). Although the United States has a long history of teaching
English to immigrants in the workplace and in so-called Americanization
classes, the modern TESOL profession arose primarily in connection with
teaching English to foreign students attending American universities and has
ties with linguistics that go back to the intensive FL programs common during
World War II. The TESOL profession has been an important locus of American
research in second language acquisition, a fact largely responsible for the
existence of a generation of American SLA specialists who do not themselves
speak a second language. One can point to outstanding exceptions of Americans
with extensive overseas experience or with outstanding FL specialization, but
the majority of contributors to the active American scene of SLA research still
belong to this English-oriented group.
One of us remembers vividly an occasion in the 1960s when he was invited
to talk about SLA research to a group of university students in Sweden. He did
what he was invited to do, in English of course, with the humbling awareness
that no corresponding group of American graduate students could have followed
and participated in such a discussion in a language other than English, whereas
most of the Swedish students could do so in two or more foreign languages.
Researchers do not necessarily need to have personal experience with the
phenomena they want to investigate; in fact, second-language-competent SLA
investigators may unconsciously assume that other learners have acquired their
second language competence in more or less the same way that they have. But it
is at least worth noting that many American SLA researchers have little or no
FL competence while most European SLA researchers have experienced the
phenomena under study.
1.3 Language Professions
Another feature of the language situation in the United States that is
relevant to our understanding of the learning and teaching of FLs is the
existence of four different language professions, each with its own occupational
goals, education or special training, and attitudes on language education
issues: FL teachers, bilingual education specialists, teachers of English as a
second language, and teachers of English as a native language. These groups, who
could be strong allies if they shared important aspects of their educational
perspectives and saw complementary roles for themselves in the American
educational system, generally see one another as adversaries or, at best, as
professionally unrelated. We will not attempt here to address the relation
between the study of literature and FL instruction as such, a problematic issue
in most European and American educational systems.
1.4 FL Instruction outside Public Education
A considerable amount of second language learning takes place outside the
FL sector of public education. Private schools tend to offer more and better FL
instruction than public schools do, but they still fall within the patterns already
described. The difference between FL instruction in state and private
universities is not so great, but private universities seem to have taken the
lead in the reintroduction of language requirements for entrance and graduation.
Outside the educational system are the numerous commercial language schools,
training programs of corporate enterprises (either internationally oriented or
with non-English-speaking employees), and the SLA that takes place under
non-tutored, "natural" conditions in the United States and by Americans abroad
or in FL communities at home. "Ethnic" schools have been surveyed (cf. Fishman
1980), but commercial schools and corporate training programs have not been much
investigated, and information on their various types of FL instruction would be
needed to understand the full range of FL learning in the United States.
Untutored SLA has in recent years become the focus of valuable research in the
United States and Europe, although the exact relation of its findings to FL
instruction in formal educational settings is still to be clarified.
1.5 Myths about Language
Finally, let us emphasize an aspect of the language situation that is not often
treated explicitly: attitudes and beliefs about language widely held by
Americans. We assume that the members of any speech community, even such a large
and complex one as the United States, share to a considerable degree a set of
such attitudes and beliefs, so-called myths about language (Ferguson and Heath
1981: xxvii-xxx). We assume further that these myths may sometimes be of
critical importance for understanding the activities of FL learning and teaching
as well as the SLA research efforts of the community. These myths vary
considerably by region, social class, and other categories, and they have not
been investigated as much as the evaluative attitudes toward languages and their
speakers (cf. Ryan and Giles 1982). Some of them, however, merit notice.
First, Americans tend to regard competence in an FL as a kind of all-or-none
personal attribute not particularly related to the process of acquisition or
the nature and level of proficiency. People have the competence or they don't:
"Does so-and-so speak Chinese?" "I don't know Spanish". Americans generally
assume (with some justification, of course) that there is little connection
between having studied a language and "knowing" it or being able to use it. The
research corrective to this myth is the current concern with proficiency testing
and other forms of measurement of language competence. Richard Lambert has
called for a "common metric measuring in an objective, consistent fashion the
degree of proficiency a person... has in foreign language" (Lambert 1987: 13).
Related to this failure to connect the processes of acquisition to the level of
competence is the notion that there are only a few "real" (one might almost say
"magical") ways to learn a language. Many people have assured us at one time
or another that the only way to learn a foreign language is to be exposed to it in
childhood, or to live in a country where it is spoken, or (usually said with a
smile) to have a mate or lover who speaks the language. The widespread belief
that living in the appropriate country will produce fluency in a language is
evidenced, for example, in the disappointment that many Stanford undergraduate
students feel after one or two quarters at a Stanford overseas campus, when they
find that they have not automatically reached full fluency. American students
typically do not expect to learn to use a language by studying it in school (and
neither do their teachers or the surrounding community), but they do expect to
learn it by being in the country, having no inkling of the time, effort, and
communicative strategies required. When Americans are faced with a need to
acquire some FL competence and the options just discussed are not available,
they want the fastest, most efficient, most painless method, preferably one that
features some new technology. The research counterpart to this view is the
perennial concern to test different "methods" to see which one is best, that is,
most efficient.
A third myth concerns the way people differ in their ability to learn
languages. Americans believe that aptitude is very important. Although many
assume that their compatriots in general have low language aptitude, they assume
just as strongly or more so that individuals differ greatly in language aptitude.
Many individual Americans claim that they themselves have no aptitude for
languages and could never learn one, whereas some people they know are, as they
say, "good at languages". Several first-rate American universities make
provision to waive their language requirement if a test shows that a particular
student has poor language aptitude.
In this connection, it is interesting to compare attitudes toward foreign
competence in English with those toward American competence in FLs. An
American's lack of competence in an FL is often attributed to low aptitude. In
contrast, a foreigner's lack of competence in English may be attributed to lack of
opportunity, clannishness, laziness, or other explanatory factors, but rarely to
lack of aptitude. Incidentally, an attitude not often verbalized but apparent from
incidental comments and behavior is that a foreigner with an excellent command
of English is somehow more intelligent and more competent in other ways than
one whose command of English is less good.
In addition to the emphasis on aptitude, Americans hold conventionalized
notions, almost stereotypes, about the relative difficulty of languages. They
assume that there is some kind of absolute scale of difficulty such that Spanish
is easier to study or to learn than French, or a more nuanced scale such that
Spanish is easier in the first year but harder in the second year. This view
contrasts with the implicit assumption of most American linguists that all
languages are roughly equal in difficulty for the newborn and that differences
in difficulty in SLA, if they exist, are due to the nature of the structural
differences between L1 and L2 (shades of contrastive analysis!). Linguistic
theories that make allowance for measurement along these lines, such as those
involving markedness or parameter-setting, could contribute to the understanding
of these questions.
2 Research on second language acquisition
On the theory side, SLA research in the United States has tended to be tied
either to linguistics or to psychology, and the tendency has often been to "apply"
a theoretical model derived from quite different contexts of language use rather
than to deal with SLA phenomena as the source for theory construction.
Interestingly, the USSR (and prerevolutionary Russia) has had the same pattern
of theory application from linguistics and psychology (Pitthan 1988) and has
experienced the same failure to construct theories that start from SLA, although
the patterns of teaching and learning FLs in the Soviet Union are dramatically
different from those in the United States.
2.1 Research Paradigms
Over the past decade and a half, research on second-language acquisition
has burgeoned to the point where even a brief lay-of-the-land discussion
becomes a formidable task. A cursory review of several recent textbooks in the
field reveals numerous approaches that have variously been labeled "theories",
"models", or "hypotheses" of SLA. The acculturation model or pidginization
hypothesis and the monitor model are listed by Gardner (1985), Ellis (1985a),
Klein (1986) and McLaughlin (1987). Ellis and McLaughlin list the universal
hypothesis, which seems to be similar to Klein's identity hypothesis. In
addition, Ellis includes accommodation theory, discourse theory, a variable
competence model, and a neurofunctional model. McLaughlin covers what he calls
cognitive theory, while in Klein we also find contrastive analysis and learner
varieties, which seems akin to what others have referred to as "interlanguage
studies" (cf. Long and Sato 1984). Gardner's review of models from social
psychology includes Carroll's conscious reinforcement model, Bialystok's
strategy model, Lambert's social psychological model, Clement's social context
model, and Giles and Byrnes' intergroup model, as well as his own
socio-educational model. Yet to date, there exists no comprehensive theory that
captures all of the various contexts of occurrence and products and processes
that have traditionally been the domains of different "parent" disciplines. At
the same time, while the most immediate goal of SLA research is perhaps to
understand better those products and processes and the effects of context on
them, implicit in all of the research are sometimes divergent long-term goals as
well: to contribute to the disciplinary bases through a greater understanding of
broader issues of the nature of language and learning and, in the more "applied"
sense, to facilitate the language learning process itself.
This pluralism in SLA theory has been viewed unfavorably in the field.
Researchers seem to feel more and more that the emergence of a single dominant
SLA paradigm would signal the maturation of the field as a discipline (cf.
Rutherford 1984; Long 1985; Gregg 1989; and others). This view can probably
be traced to Kuhn's (1962) work, The Structure of Scientific Revolutions, in which
the social sciences are presented as being in a pretheoretical state because,
unlike the "mature" hard sciences, they do not share an implicit and pervasive
commitment to a single set of assumptions about questions, topics, research
sites, units of analysis, and methods of observation and analysis.
While recognizing the need for theory building, we tend to side with Shulman,
who has recently pointed out that Kuhn erred (and we might add SLA is in
danger of erring) in "diagnosing this characteristic of the social sciences as a
developmental disability" (1986: 4). He cites the philosopher of science
Feyerabend, who says:
"You can be a good empiricist only if you are prepared to work with many
alternative theories rather than with a single point of view and 'experience'. This
plurality of theories must not be regarded as a preliminary stage of knowledge
which will at some time in the future be replaced by the One True Theory"
(1970: 14).
We do not mean to say that research on SLA should not be theory driven.
But Shulman raises an important caveat against the potential trivialization of the
field by a single paradigmatic view. While theory drives much of research (some
would say it should drive all research), there are many kinds of theory that need
to be taken into account in SLA.
The name of the field of inquiry itself suggests need for both a theory of
language and a theory of learning. Given the current state of linguistic theory in
the United States, one can find any number of competence and performance
models. The same could be said of learning theory, although any theory of
learning would necessarily include some specification of an initial state, a
motivation to learn, a specification of input, an acquisition procedure, and a
description of a desired state. In addition, researchers who deal in tutored
contexts need a model of teaching. Closely related to all of these areas is a
theory of research design. In the following sections, we review some research on
learning contexts, on the nature of language, on the acquisition process, and on
teaching behaviors believed to facilitate learning.
2.2 Learning Contexts
Several taxonomies for the contexts of teaching and learning second
languages are common in the literature. One involves the labels assigned to
teaching methodology. Some years ago, researchers hoped that a comparison of
"methods" would lead to an optimal one for language learning. That kind of
research, which takes method as the unit of analysis, has proven not very fruitful.
Several authors (Brumfit this volume; Larsen-Freeman this volume; Long this
volume) critique this line of research; we will not review their arguments here.
Other taxonomic distinctions, however, persist in contemporary research.
One is that between tutored and untutored language learning. Another divides
the second language learning field into second language, foreign language, and
bilingual education. Both distinctions implicitly reflect differences in degree, if
not in kind, of the processes and products under investigation. We do not
disparage the practical worth of these taxonomies, but they are useful only so
long as the contextual features used to form the bases of the taxonomies differ
significantly across categories and are sufficiently uniform within them.
One danger is that these taxonomic distinctions may obfuscate both cultural
and individual differences. For example, DeKeyser's (1986) description of the
learning strategies of a group of American students in a one-semester study
abroad course in Spain will ring familiar to anyone who has had experience with
American students in similar programs, regardless of the host country. At the
same time, individual differences within the group were striking, even though
they were in the same FL program.
Within the North American context, research on these issues has tended to
concentrate north of the U.S.-Canada border. In his review of social psychology
and SLA, Gardner argues that, among the various individual differences
examined in the SLA literature, an integrative motive (broadly defined) and
"language aptitude are the only two individual differences which have been well
documented to date as being implicated in the language learning process" (1985:
83). He argues further that changes in social attitudes may be affected by
second language learning experiences and that these changes are perhaps greatest
when programs involve novel experiences of rather short duration, such as
intensive bicultural experiences among students who maximize contacts with
native speakers or in short intensive programs.
From this perspective, if parents and community play a role in socialization
and the formation of attitudes, they also influence the SLA process. Gardner
(1985:146) states:
"Second language acquisition takes place in a particular cultural context...
[T]he beliefs in the community concerning the importance and meaningfulness
of learning the language, the nature of the skill development expected,
and the particular role of various individual differences in the language learning
process will affect second language acquisition".
To the extent that Americans hold various "myths about language", researchers
would want to know what communities expect of foreign language
classrooms, what Americans perceive as "good" in foreign languages, and how
these expectations become institutionalized. These attitudes would have important
implications for language policy. Yet to date, most models of SLA emerging
in the United States have tended to overlook individual and contextual differences
in favor of other questions.
2.2.1 Formal theories of language

Studies that focus on the nature of language include those within formalist
syntactic frameworks, such as Chomsky's government-binding (GB) (1981),
Perlmutter's (1983) relational grammar, Bresnan's (1982) Lexical-Functional
Grammar, and Gazdar et al.'s (1985) Generalized Phrase Structure Grammar.
Not all of these claim to have implications for acquisition. For example, Gazdar
et al. state with reference to Generalized Phrase Structure Grammar:
"In view of the fact that the packaging and public relations of much recent
linguistic theory involves constant reference to questions of psychology,
particularly in association with language acquisition, it is appropriate for us to make a
few remarks about the connections between the claims we make and issues in
the psychology of language. We make no claims, naturally enough, that our
grammatical theory is eo ipso a psychological theory. Our grammar of English
is not a theory of how speakers think up things to say and put them into words.
Our general linguistic theory is not a theory of how a child abstracts from the
surrounding hubbub of linguistic and non-linguistic noises enough evidence to
gain a mental grasp of the structure of a natural language. Nor is it a biological
theory of the structure of an as-yet-unidentified mental organ. It is irresponsible
to claim otherwise for theories of this general sort" (1985: 5).
Other theories, such as Lexical Functional Grammar (LFG), have not yet
been applied to SLA, although Pinker's work (1984) within an LFG framework
on first language acquisition suggests that it will be. Rosen (1987) explores the
implications of Relational Grammar for SLA. While Newmeyer (1987)
points out that many of the assumptions of these frameworks are converging, the
bulk of the work on SLA within formal theories of grammar reflects a strong
commitment to government-binding, and has focused solely on linguistic aspects
of initial and final state. A clear articulation of this position is found in Gregg
(1989).
The argument about SLA theory seems to be as follows. Since they do not
have a complete theory of language, researchers cannot look at language
acquisition. Instead they should look at the acquisition of linguistic or grammatical
competence (the terms are used interchangeably throughout our paper).
Grammatical competence is defined as our intuitive knowledge of the syntax,
phonology, and to some extent semantics of the language in question. One assumption
within this framework is that grammatical competence is independent of
language use and involves a mental system that is quite separate from pragmatic
knowledge, conceptual knowledge, perception, and other human faculties. This
has been called the autonomous nature of grammar. At the same time, one
sense in which language is perceived to be modular is that its use results from
the interaction of linguistic competence with other mental faculties or modules,
involving, for example, pragmatic knowledge, conceptual knowledge, and
perception.
Gregg's rather strong position is that SLA should be centrally concerned
with the acquisition of linguistic competence. In addition to providing a sense of
direction to the field of SLA, such an orientation would bring other advantages
to the field, he maintains: a "rigor" inherent in formal approaches and a
knowledge of what is innate in language and what is acquired.
These apparent advantages can also be seen as problematic areas for formal
approaches. To date, agreement on the relevant parameters and their levels
of expansion is far from universal. For example, working within a GB
framework, Huang (1982) and Koopman (1984) offer differing explanations for
head direction in Chinese, which, as has been pointed out in the literature
(Eubank 1988; Bley-Vroman and Chaudron 1987; Klein 1987), have different
effects on the interpretation of SLA data.
A second problem involves the tapping of a learner's intuitions about a
second language. Coppieters (1987) argues that the linguistic competence of even
very fluent second language speakers differs in unexpected ways from that of
native speakers. Furthermore, Birdsong (1988) points out that, while such research
intends to describe the learner's grammatical competence at any given point in
time as evidenced by intuitions about the second language, the interaction of
multiple cognitive mechanisms (modularity) makes it difficult to base judgments
about underlying linguistic competence on performance data such as imitation
tasks.
A final problem to which formalist theories have given little attention is the
process of acquisition, either in the sense of accounting for how a learner is
"driven" from one stage of knowledge to another, or in the sense of providing a
theory of the actual time course of acquisition. As Marshall (1979) points out
and Berwick and Weinberg (1986) reiterate, "No one has seriously attempted to
specify a mechanism that "drives" language acquisition through its "stages" or
along its continuous function" (Marshall 1979: 443). That is, it is not always clear
what the learning process includes, how learners' linguistic competence changes
from time 1 to time 2. For example, in distinguishing between the acquisition of
linguistic competence and communicative competence, Gregg (1989: 34-35)
writes of his own experiences:
"Japanese is a pro-drop language, and knowing that, I drop pronouns left and
right—including at times when a native speaker would not. That is to say that I
don't yet know the discourse restraints (at least) on pronoun-dropping in
Japanese, and thus my 'communicative competence' is not up to native
standards".
Apparently, this model views the acquisition of linguistic competence as
instantaneous. Variation is a matter of pragmatic competence, clearly out of the
realm of legitimate inquiry for those interested in the acquisition of syntax.
2.2.2 Functionalist approaches to language

While formalist approaches to SLA are primarily concerned with the
learner's state of grammatical competence, as exemplified through intuitive
judgments of grammaticality, other researchers have focused more on the process of
acquisition (that is, moving from one state to another) as revealed through the
system, variability, and change in the learner's production and comprehension.
At the risk of oversimplifying, we might call much of this research
"functionalist".
As an approach to the study of language, functionalism traces its roots to
European scholars. In the United States it does not represent a single unified
theory so much as an emerging school of thought that defines beginning
assumptions, proper goals, and interpretive conceptions for investigations. Nor is it in
principle, as Kuno (1987: 1) points out, in conflict with current formal models of
grammar such as government-binding. However, some beginning assumptions of
this approach do part ways with those of most formal theories in important
respects, and these differences have implications for the ways research is
conducted.
While most functionalists recognize language as a biological system, in this
view, the innate capacities that account for language ability are not necessarily
domain-specific (autonomous). A commonly held goal within this research
program is to uncover more general universal cognitive abilities which underlie
language use and acquisition. Grammar is seen as a solution to the problem of
mapping nonlinear representations on a linear channel.
Following from that view of grammar, most functionalist approaches object
to the formal separation of morphosyntax (or grammar) from semantics and
pragmatics. The common view is that all aspects of language, including
acquisition, are driven by communicative need. MacWhinney, Bates, and Kliegl (1984)
write: "The forms of natural languages are created, governed, constrained,
acquired, and used in the service of communicative functions".
From this perspective, no explanation of linguistic phenomena can
exclude semantic and pragmatic considerations. Silva-Corvalán makes this claim
most explicit in her discussion of Muysken's (1981) hierarchy of markedness for
tense as applied to data on language attrition: "In my view of language as a
system of human communication, to be explanatory, a markedness hierarchy needs
to be justified with reference to factors which lie outside the linguistic system,
namely cognitive and interactional factors" (1987: 14).
These assumptions have implications for what is deemed legitimate terrain
for second language acquisition research. Rather than showing an overriding concern
with abstract formulations of linguistic competence, SLA researchers working,
either explicitly or implicitly, within this framework have been concerned with
the production of discourse rather than clause-length phenomena (e.g. Hatch
1978; Tomlin 1984), with intra-speaker variation (e.g. Tarone 1984; Ellis 1985b),
with changes over time as exemplified by learner production of naturally
occurring speech (e.g. Huebner 1983; Sato 1985), with the nature of linguistic input
(e.g. Chaudron 1985), and with strategies employed for comprehension and
production (Faerch and Kasper 1987; Chamot et al. 1988).
This more general approach also has its problems. Its emphasis on language
in use has often resulted in a failure to tap the full range of what a learner
"knows" about the language being acquired. In addition, often research of this
type has not clearly articulated the relationship between aspects of language use
and acquisition of specific features of a given linguistic system. Finally, as Gregg
(1989) justifiably points out, it has often failed to distinguish between what
learners do because they are not fully proficient in the target language and what they
do by virtue of being human.
Given the current state of affairs of all linguistic theories, the prospects are
as promising for SLA to contribute to them as vice versa. While one finds
numerous claims that SLA is in fact doing so, to date the research in this field
has been more of a confirmatory nature (cf. Huebner 1987).
3 Models of Learning
Another large body of SLA research on the American scene has focused on
the learning and teaching of second languages. Work in social psychology, such
as Gardner's (1985) and Giles and Byrne's (1982), looks at motivation and
larger social variables in second language learning; other research has drawn
heavily on interactional models of discourse to isolate those features of
interaction that presumably facilitate learning. The most comprehensive published
review is Chaudron's Second Language Classrooms: Research on Teaching and
Learning (1988). Here we highlight some conclusions that can be drawn from it.
First, while correlations can be found between, for example, (1) modifications
in teacher talk and in-class versus out-of-class interaction; (2) input generation
and proficiency; (3) task type and type or amount of interaction; (4) amount of
teacher talk and language proficiency of learners; (5) learner production and
achievement test scores; and (6) learners' negotiation behaviors and proficiency,
there is little study of the causal relationship between the members of these
pairs. Second, the vast majority of the studies cited in Chaudron, and
presumably the bulk of the research in this area, look at English as a second language
classrooms. Few studies focus on the range of teacher and student behaviors and
interaction patterns in FL classes in the United States. Third, the bulk of the
studies cited in Chaudron are of the process-product, or more accurately the
pseudo-process-product, variety. Very few classroom-centered qualitative
studies of SLA, and virtually none of FL acquisition, exist.
Finally, there are few studies that take a programmatic look at instructional
programs, especially with respect to FL teaching and learning in the United
States. For example, most university-level FL programs offer courses such as
"Advanced Conversation" and "Grammar Review", which are usually offered to
students at specific junctures in their language learning careers. Yet little
research of which we are aware carefully examines either instructional goals and
outcomes in these "specialized" language courses or the assumptions about FL
learning that motivate their inclusion at those junctures.
4 Conclusions
We have tried to present a picture of the context of SLA research in the
United States, and to outline broadly and critique briefly some of the major
research trends in the field today within that context. What emerges is a complex
picture of the acquisition process, as seen by researchers from various
persuasions. To deal with this complex phenomenon, Huebner (1987) has called for
the emergence of more complex research designs and research programs in SLA
that include experiment and ethnography, quantitative and case studies. Such
approaches carry with them the serious danger of disintegrating into utter chaos
without a careful articulation of the questions asked and the types of knowledge
produced. The alternative, however, would be to reduce the richness of the field
to "nothing more than the atomism of a multiple variable design" (Shulman
1986), and that, in our view, would be even worse.
References
Berwick, R.C. and A.S. Weinberg. 1986. The grammatical basis of linguistic performance: Language
use and acquisition. Cambridge: MIT Press.
Birdsong, D. 1988. Second-language acquisition theory and the logical problem of the data. Paper
presented at the eighth Second Language Research Forum, University of Hawaii, Manoa,
March.
Bley-Vroman, R. and C. Chaudron. 1987. A critique of Flynn's parameter setting model of second
language acquisition. Unpublished manuscript, University of Hawaii, Manoa.
Bresnan, J. 1982. The mental representation of grammatical relations. Cambridge: MIT Press.
Brumfit, C. This volume. "Problems in defining instructional methodologies."
Campbell, R.N. and S. Schnell. 1987. "Language conservation." Annals of the American Academy
of Political and Social Science 490.177-185.
Chamot, A.U., J.M. O'Malley and L. Kupper. 1988. Learner strategies for listening
comprehension in English as a second language. Paper presented at the American Educational
Research Association Annual Meetings, New Orleans, April.
Chaudron, C. 1985. "Intake: on models and methods for discovering learners' processing of
input." Studies in Second Language Acquisition 7/1.1-14.
Chaudron, C. 1988. Second language classrooms: Research on teaching and learning. Cambridge:
Cambridge University Press.
Chomsky, N. 1981. Lectures on government and binding. Dordrecht: Foris Publications.
Coppieters, R. 1987. "Competence differences between native and near-native speakers."
Language 63/3.544-573.
DeKeyser, R.M. 1986. From learning to acquisition? Foreign language development in a U.S.
classroom and during a semester abroad. Ph.D. thesis. Stanford University.
Ellis, R. 1985a. Understanding second language acquisition. Oxford: Oxford University Press.
Ellis, R. 1985b. "Sources of variability in interlanguage." Applied Linguistics 6/2.118-131.
Eubank, L. 1988. Parameters in L2 learning: Flynn revisited. Paper presented at the eighth
Second Language Research Forum, University of Hawaii at Manoa, March.
Faerch, C. and G. Kasper. 1987. "The role of comprehension in second-language learning."
Applied Linguistics 7/3.251-274.
Ferguson, C.A. and S.B. Heath. 1981. "Introduction." Language in the USA ed. by C.A. Ferguson
and S.B. Heath. Cambridge: Cambridge University Press.
Feyerabend, P. 1974. "How to be a good empiricist: a plea for tolerance in matters
epistemological." The philosophy of science ed. by P.H. Nidditch, 12-39. Oxford: Oxford University Press.
Fishman, J.A. 1980. "Ethnic community mother tongue schools in the USA: Dynamics and
distributions." International Migration Review 14.235-247.
Fishman, J.A., V. Nihirny, J. Hoffman and R. Hayden. 1966. Language loyalty in the United States.
The Hague: Mouton.
Gardner, R.C. 1985. Social psychology and second language learning: The role of attitudes and
motivation. Baltimore: Edward Arnold.
Gazdar, G. et al. 1985. Generalized phrase structure grammar. Oxford: Basil Blackwell.
Giles, H. and J.L. Byrne. 1982. "An intergroup approach to second language acquisition." Journal
of Multilingual and Multicultural Development 3/1.17-40.
Gregg, K.R. 1989. "Linguistic perspectives on second language acquisition: What could they be,
and where can we get some?" Linguistic Perspectives on Second Language Acquisition ed. by
S.M. Gass and J. Schachter, 15-40. Cambridge: Cambridge University Press.
Hatch, E.M. 1978. "Discourse analysis and second language acquisition." Second language
acquisition: A book of readings ed. by E.M. Hatch. Rowley, MA: Newbury House.
Huang, C.J. 1982. Logical relations in Chinese and the theory of grammar. Ph.D. thesis.
Massachusetts Institute of Technology.
Huebner, T. 1983. A longitudinal analysis of the acquisition of English. Ann Arbor: Karoma.
Huebner, T. 1987. SLA: a litmus test for linguistic theory? Paper presented at the conference on
Second Language Acquisition: Contributions and Challenges to Linguistic Theory, Stanford
University, July.
Klein, W. 1986. Second language acquisition. Cambridge: Cambridge University Press.
Klein, W. 1987. SLA theory: prolegomena to a theory of language acquisition and implications for
theoretical linguistics. Paper presented at the conference on Second Language Acquisition:
Contributions and Challenges to Linguistic Theory, Stanford University, July.
Koopman, H. 1984. The syntax of verbs. Dordrecht: Foris Publications.
Kuhn, T.S. 1962. The structure of scientific revolutions. Chicago: University of Chicago Press.
Kuno, S. 1987. Functional syntax: Anaphora, discourse and empathy. Chicago: University of
Chicago Press.
Lambert, R.D. 1987. "The improvement of foreign language competence in the United States."
Annals of the American Academy of Political and Social Science 490.9-19.
Larsen-Freeman, D. This volume. "Research on language teaching methodologies: a review of the
past and an agenda for the future."
Long, M. 1985. Theory construction in second language acquisition. Paper presented at the sixth
Second Language Research Forum, University of California at Los Angeles, February.
Long, M. This volume. "Focus on form: a design feature in language teaching methodology."
Long, M. and C. Sato. 1984. "Methodological issues in interlanguage studies: an interactionist
perspective." Interlanguage ed. by A. Davies, C. Criper and A.P.R. Howatt, 253-279.
Edinburgh: Edinburgh University Press.
MacWhinney, B., E. Bates and R. Kliegl. 1984. "Cue validity and sentence interpretation in
English, German, and Italian." Journal of Verbal Learning and Verbal Behavior 23/1.127-150.
Marshall, J.C. 1979. "Language acquisition in a biological frame of reference." Language
Acquisition ed. by P. Fletcher and M. Garman, 437-453. New York: Cambridge University Press.
McLaughlin, B. 1987. Theories of second language learning. London: Edward Arnold.
Muysken, P. 1981. "Creole tense/mood/aspect systems: the unmarked case?" Generative studies on
Creole languages ed. by P. Muysken, 181-199. Dordrecht: Foris Publications.
Newmeyer, F.J. 1987. "The current convergence in linguistic theory: Some implications for second
language acquisition research." Second Language Research 3/1.1-19.
Oxford, R.L. and N.C. Rhodes. 1988. "U.S. foreign language instruction: Assessing needs and
creating an action plan." ERIC/CLL News Bulletin 11/2.1 + 6-7.
Perlmutter, D. 1983. Studies in relational grammar 1. Chicago: University of Chicago Press.
Pinker, S. 1984. Language learnability and language development. Cambridge: Harvard University
Press.
Pitthan, I.M. 1988. A history of Russian/Soviet ideas about language: Background to Soviet
foreign language pedagogy. Unpublished Ph.D. thesis. Stanford University.
Rosen, C. 1987. Relational grammar and SLA. Paper presented at the conference on Second
Language Acquisition: Contributions and Challenges to Linguistic Theory, Stanford University,
July.
Rutherford, W.E. 1984. "Description and explanation in interlanguage syntax: state of the art."
Language Learning 34/3.12-55.
Ryan, E.B. and H. Giles, eds. 1982. Attitudes toward language variation: Social and applied
contexts. London: Edward Arnold.
Sato, C.J. 1985. The syntax of conversation in interlanguage development. Unpublished Ph.D.
thesis. University of California at Los Angeles.
Shulman, L. 1986. "Paradigms and research programs in the study of teaching: A contemporary
perspective." Handbook of research on teaching (3rd ed.) ed. by M.C. Wittrock, 3-?. New
York: MacMillan Publishing Company.
Silva-Corvalán, C. 1987. Cross-generational bilingualism: theoretical implications of language
attrition. Paper presented at the conference on Second Language Acquisition: Contributions
and Challenges to Linguistic Theory, Stanford University, July.
Tarone, E. 1984. "On the variability of interlanguage systems." Universals of second language
acquisition ed. by F.R. Eckman, L.H. Bell and D. Nelson, 3-23. Rowley, MA: Newbury House.
Tomlin, R.S. 1984. "The treatment of foreground-background information in the on-line
descriptive discourse of second language learners." Studies in Second Language Acquisition
6/2.115-142.
Veltman, C. 1983. Language shift in America. Berlin etc.: Mouton.
Waggoner, D. 1981. "Statistics on language use." Language in the USA ed. by C.A. Ferguson and
S.B. Heath, 486-515. Cambridge: Cambridge University Press.
Empirical Foreign Language Research in Europe
Theo van Els, Kees de Bot & Bert Weltens
The purpose of this paper is not to present a full survey of past and ongoing
empirical research in Europe on foreign language teaching (FLT), even if there
may well be a great need for such a survey. Europe has no authoritative source of
information on educational research comparable to the Handbook of Research on
Teaching (Merlin C. Wittrock, ed., 1986, 3rd ed.), a project of the American
Educational Research Association; in the European context such a handbook would
certainly have an article on foreign language teaching besides, or instead of, one
on 'teaching bilingual learners' (Wong Fillmore and Valadez 1986). Nor
are there many good and systematic incidental treatments of empirical research
in any number of the relevant sub-fields of our field of action. The scope of this
overview is a much more limited one; the main questions will concern the
following aspects of FLT research in Europe:
1. the state of FLT provisions;
2. the state of empirical FLT research;
3. requirements for the near future.
In this way we hope to provide some insight into past and ongoing
developments in the European scene of FLT research, and to suggest some directions
that future research might take.
1 The state of FLT provisions
In order to give an impression of the European landscape as regards FLT
provisions, we will briefly deal with five Western European countries, viz. the
Federal Republic of Germany (FRG), England, France, Sweden, and the
Netherlands.
In primary education foreign languages are compulsory in two countries
only: Sweden and, recently, the Netherlands. The language being taught is
English in both cases; in Sweden it starts at the age of 9, in the Netherlands at the
age of 10. In a number of other countries there is some FLT, but the children
attend it on a voluntary basis: in England this is always French; in the FRG it is
either English or French.
In secondary education Sweden and the Netherlands appear to have more
comprehensive FLT provisions than the other countries. In Sweden English is a
compulsory subject for everybody, and a significant number of students have to
choose at least one other foreign language (either German or French), and at a
later stage a third language may be added (mostly German, French, or Russian).
About 65% of all students choose to learn a second foreign language besides
English, about two-thirds of whom choose German, and one-third French
(Henningsson 1986: 4).
In the prevailing Dutch system of secondary education there is a division
into a general and a more vocation-oriented type right from the first year
onwards. In the general type (approximately 65% of all pupils), three foreign
languages (English, German, and French) are compulsory during the first phase; in
the second phase, after three or four years, every pupil has to continue
learning at least one foreign language (English in virtually all cases), and has the
possibility to choose a second or even a third language. A second foreign
language is chosen by 65% of these pupils (mostly German or French), a third by
14%.
In the vocation-oriented type only one foreign language is obligatory right
from the beginning. There is no specification as to which language, but most
schools teach English.
In the other three countries, participation of pupils in FLT programs is
generally speaking lower. In the FRG the picture is rather complicated, because
the 11 Länder are very autonomous when it comes to laying down educational
policies. However, one compulsory foreign language is found everywhere, and
that is usually English. In the grammar school type usually two foreign languages
are compulsory, the second usually being French or Latin, and a third language
is optional in many places.
There is a wide diversification in the field of foreign languages in the
French system. All students learn one compulsory foreign language, but they
themselves decide which. All schools have to offer English, German, and
Spanish (if a certain minimum number of pupils express an interest), and for some
schools the same holds good for Italian and Russian. Figures provided by Zapp
(1979: 18-19) show that, even when there is a completely free choice for
students and/or their parents, English has in fact a monopoly position.
England, finally, has always had a strong tradition of delegating educational
policy making to the schools. According to DES (1983: 3) almost all pupils start
secondary school with at least one foreign language, mostly French. Data from
examinations show that only one-third of the population takes French up to the
so-called O-level.
2 The state of empirical FLT research in Europe
In order to get an insight into past and ongoing developments in terms of
empirical research efforts, a quantitative evaluation of American and European
publications on empirical research in this field was undertaken. This evaluation
was based on three analyses of numbers of publications.
The number of publications is, no doubt, whichever way one looks at it, an
indication of the state of affairs in a field of study. One knows, of course, that
all kinds of objections can be raised against such counts. We are not going to
deal with these problems here and, although we are well aware that even more
objections can be raised against these tentative and provisional counts, we still
think that the counts presented here are certainly indicative of a number of things.
Table 1. Analysis I: Number of publications dealing with FLT research.
                           1966-71  1972-76  1977-81  1982-87    TOTAL
Belgium                          -        -        -        4        4
FRG                              4        9       19       47       79
France                           1        -        2        6        9
Great-Britain                    1        4        3        5       13
Netherlands                      -        3        7       15       25
Scandinavia                      3        8        2        6       19
Other W&E. Eur. countries        -        2        3        4        9
SUB-TOTAL EUROPE                 9       26       36       87      158
USA/Canada                       6       11        7       31       55
Other countries                  -        -        2        3        5
TOTAL                           15       37       45      121      218

For the first two analyses, which were adapted from Van Els (1988), we
used the fairly representative collection of books and journals in the field of
applied linguistics at our department. All the important international journals are
represented and there are about 5000 volumes: handbooks, monographs,
proceedings and readers, not including foreign language teaching materials, of
course. All the journals, from their first issues, all the books acquired since
about 1976 and some of the books from before 1976, have been systematically
catalogued in a fully computerized bibliographical system. For analysis I
separate lists were printed, for four consecutive periods of 5 or 6 years, of all books
and articles to which the key-word 'foreign language teaching', and also either
the key-word 'empirical research' or the key-word 'research report' had been
attributed. The total number of items found was 218.
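The selection rule for analysis I amounts to a boolean filter followed by a count per period. A minimal sketch in Python is given below, purely as an illustration: the record layout, the field names, and the four sample entries are assumptions made for the example and are not taken from the bibliographical system described here.

```python
# Hypothetical catalogue records. The field names and sample entries are
# illustrative assumptions, not data from the actual bibliographical system.
records = [
    {"year": 1968, "keywords": {"foreign language teaching", "empirical research"}},
    {"year": 1974, "keywords": {"foreign language teaching", "research report"}},
    {"year": 1979, "keywords": {"foreign language teaching"}},
    {"year": 1985, "keywords": {"second language learning", "empirical research"}},
]

# The four consecutive periods used in the tables.
PERIODS = [(1966, 1971), (1972, 1976), (1977, 1981), (1982, 1987)]
RESEARCH_KEYWORDS = {"empirical research", "research report"}


def period_label(start, end):
    """Format a period the way the tables do, e.g. '1966-71'."""
    return f"{start}-{str(end)[2:]}"


def matches_analysis_1(record):
    """'foreign language teaching' AND ('empirical research' OR 'research report')."""
    keywords = record["keywords"]
    return "foreign language teaching" in keywords and bool(keywords & RESEARCH_KEYWORDS)


def count_by_period(items):
    """Count the matching records in each period."""
    counts = {period_label(start, end): 0 for start, end in PERIODS}
    for item in items:
        if not matches_analysis_1(item):
            continue
        for start, end in PERIODS:
            if start <= item["year"] <= end:
                counts[period_label(start, end)] += 1
    return counts


counts = count_by_period(records)
print(counts)
```

In this sketch the third sample record is excluded because it carries no research key-word, and the fourth because it lacks the key-word 'foreign language teaching'.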
In table 1 these publications have been categorized according to the country
where the research was carried out. What one sees is first of all a steady increase
in the number of publications dealing with FLT research over the past twenty
years; the increase is particularly striking for the fourth period. Secondly,
Europe appears to have shown a steadier increase than North America.
In this count, the share that individual countries take in the total output
varies a great deal. Particularly low is the share of both France and
Great-Britain. Where there is an overall increase of the output for all countries over the
period, Scandinavia is an exception to the rule: the number for 1972/76 reflects
the special activities in connection with the well-known GUME-project (see
Von Elek and Oskarsson 1975). Another striking point is the fact that the FRG
has produced a great number of more 'general' works, i.e. works discussing
research planning, design, or policy, most of them in the last few years.
Table 2. Analysis II: Number of publications dealing with research on FL/L2 learning
and teaching.
                           1966-71  1972-76  1977-81  1982-87    TOTAL
Belgium                          -        2        3       11       16
FRG                             13       44       85       94      236
France                           3        4        3       14       24
Great-Britain                   11        8       11       19       49
Netherlands                      2       17       48       78      145
Scandinavia                      4        8       10       15       37
Other W&E. Eur. countries        6       10       11       19       46
SUB-TOTAL EUROPE                39       93      171      250      553
USA/Canada                      33       60       96      122      311
Other countries                  1        2       11       14       28
GRAND TOTAL                     73      155      278      386      892

In order to validate the figures in table 1, a second analysis was carried out.
In this second analysis all those documents that had been assigned either
'empirical research' or 'research report' were again selected, as had been done in
the first, but instead of just adding 'foreign language teaching' as a selection
term, 'foreign language teaching or foreign language learning or second
language teaching or second language learning' was added. This led to a total of 892
publications being selected. They were categorized according to country and
period in the same way, with the results shown in table 2.
As can be seen in table 2, the overall tendencies are comparable to those in
table 1: a general increase over the years, and a relatively minor contribution
from France and Great-Britain. Note, also, that the share from the USA and
Canada has risen remarkably (from 25% in table 1 to 35% in table 2), mainly as a
result of the wealth of Canadian publications on L2 learning and teaching.
Nevertheless, this increase does not bridge the gap between Europe and North
America.
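The share quoted for table 2 can be rechecked from the TOTAL column of that table. The following is a minimal sketch in Python, not part of the original analysis; the variable names are illustrative assumptions, and only the counts are taken from table 2.

```python
# Totals per region, copied from the TOTAL column of table 2.
totals = {
    "Europe": 553,
    "USA/Canada": 311,
    "Other countries": 28,
}

# Grand total of publications selected in Analysis II.
grand_total = sum(totals.values())

# Share of each region as a percentage of the grand total.
shares = {region: 100 * n / grand_total for region, n in totals.items()}

for region, share in shares.items():
    print(f"{region}: {share:.0f}%")
```

Rounded to whole percentages, the North-American share computed this way agrees with the 35% cited for table 2. The 25% cited for table 1 cannot be rechecked in the same way here, since that table's counts are not reproduced in this passage.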
In the third analysis, quite a different perspective was taken. We opted for a
count of European publications in a limited set of journals figuring in the Arts &
Humanities Citation Index, which we take as an indication of their scientific impact.
Nine journals were selected from this corpus on the basis of our estimation
of their relevance for the field. They were the following:
1. Applied Linguistics;
2. Canadian Modern Language Review;
3. Foreign Language Annals;
4. International Review of Applied Linguistics;
5. Journal of Multilingual and Multicultural Development;
6. Language Learning;
7. Modern Language Journal;
8. System;
9. TESOL Quarterly.
Table 3. Analysis III: Number of publications per country.
                     FLT    FLT    SLT    SLT    Testing  Total
                     +Emp   -Emp   +Emp   -Emp
FRG                   13     15      3      2        3      36
Great-Britain          7     46     12     15        6      86
Netherl./Belgium       6     14     10      0        1      31
Scandinavia            5     10      8      0        1      24
E. Europe              0      5      0      1        0       6
S. Europe              1     19      3      2        1      26
Total                 32    109     36     20       12     209

Of these journals we analysed the years 1981-1987. Review sections, 'Notes
and discussion', and the like were disregarded. All European publications were
counted that came under the headings of foreign language teaching/learning or
second language teaching/learning (SLT). These publications were further subdivided
on the basis of their nature being empirical ("+Emp") or not ("-Emp").
Furthermore, we used a category dealing with aspects of designing and evaluating
language tests, to which we will be referring as "testing."
This third analysis yielded 209 publications: 141 dealing with FLT, 56 with
SLT, and 12 dealing with testing. Before presenting data pertaining to different
countries, it should be pointed out that —in contrast to the previous analyses —
the label "FRG" also includes Austria; that "Great-Britain" also includes Ireland;
that "Scandinavia" also includes Finland; and that "S. Europe" was also
interpreted somewhat liberally to include not only Spain, Italy and France, but
also Greece, Turkey and Yugoslavia. However, all of the countries that were
added in this way occurred very infrequently in the corpus selected.
As far as the totals per country are concerned, the enormous contribution
from Great-Britain is remarkable, while all other countries/regions also make a
reasonable contribution of 24 to 36, except Eastern Europe, which only provides
6 publications. Note also that for Great-Britain, Eastern and Southern Europe
the preponderance of the contributions is non-empirical, whereas for the other
countries there tends to be a balance between the number of empirical and non-
empirical publications.
Testing seemed to be a subject almost exclusively dealt with in the FRG and
Great-Britain. This somewhat surprising result was validated by additionally
analysing the first four volumes of Language Testing (1984-1987). Our results
were confirmed: apart from a remarkably strong Israeli contribution, the FRG
and Great-Britain appeared to be the strongest European contributors to the
language testing literature (6 and 11 articles respectively, out of a total of 19; the
remaining two came from the Netherlands).
Table 4. A comparison between the results of the three analyses (%).
                         FLT +Emp      FLT/SLT +Emp
                         I     III     II     III
FRG                      54     40     37      24
Great-Britain             6     22      8      31
Netherlands/Belgium      22     19     36      18
Scandinavia               7     16      6      18
E. Europe                 5      0      8       0
S. Europe                 7      3      6       6
When we want to compare the data from the three analyses, the best comparison
is the number of empirical studies dealing with FLT (Analysis I), and
FLT or SLT (Analysis II) from the period 1982-1987 on the one hand, and the
same categories of studies from the period 1981-1987 (Analysis III) on the other
hand. This comparison is represented in table 4 in terms of percentages per
country. (For the sake of simplicity, we have left out the North-American articles
from analyses I and II, and computed the percentages on the basis of the
European sub-total.)
Table 4 shows that all three analyses yield highly comparable results in
many respects, but there are also a few remarkable differences. On the one
hand, analyses I and II overestimate the German and Dutch/Belgian contributions;
this may be attributed to the nature of the database used, which contains a
relatively high proportion of documents written in German and Dutch. On the
other hand, analysis III yields a (much) larger contribution from Great-Britain
and the Scandinavian countries; this may be due to the fact that the preponderance
of the journals selected publish in English, and the fact that one of the
journals (System) is based in Sweden, respectively. In fact, analyses I and II represent
the total research effort within each country, whereas analysis III is
limited to that part of the effort that is likely to have an international impact.
A clear and important finding in analyses I and II was the steady rise in the
number of empirical publications over time. In analysis III, which dealt with a
relatively short period of time, we also looked at this development, with the
result presented in table 5.
Table 5. Analysis III: Number of empirical articles per year.
              81   82   83   84   85   86   87   Total
FLT +Emp       6    3    6    4    5    4    4      32
SLT +Emp       5    3    3   10    4    5    6      36
Total         11    6    9   14    9    9   10      68
The tendency noted in analyses I and II across the years 1966-1987 appears
not to continue within the 1980s: the number of empirical articles on FLT/SLT
fluctuates around 10 per year, and there is no sign of an increase over the years.
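The statement that the yearly totals fluctuate around 10 without any clear increase can be checked against table 5. The sketch below is an illustrative assumption rather than a procedure used in the original study: it computes the mean number of empirical articles per year and an ordinary least-squares slope over the seven yearly totals. With so few data points this is a descriptive check only, not a significance test.

```python
# Yearly totals of empirical FLT/SLT articles, copied from the Total row of table 5.
years = [1981, 1982, 1983, 1984, 1985, 1986, 1987]
counts = [11, 6, 9, 14, 9, 9, 10]

n = len(years)
mean_year = sum(years) / n
mean_count = sum(counts) / n

# Ordinary least-squares slope: change in number of articles per additional year.
sxy = sum((x - mean_year) * (y - mean_count) for x, y in zip(years, counts))
sxx = sum((x - mean_year) ** 2 for x in years)
slope = sxy / sxx

print(f"mean per year: {mean_count:.1f}")
print(f"trend: {slope:+.2f} articles per year")
```

A mean just under 10 and a slope close to zero are in line with the reading given in the text.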
One final comment concerning the figures should be added here. What they
irrefutably show is that there has been an increase of empirical research over the
past two decades. That in itself is very gratifying. What the figures do not show,
however, is how our field compares in this respect to other fields of research.
Whether, therefore, the rate of growth of applied linguistic research is satisfactory
in comparison with that of other fields or, for that matter, in proportion
to the need for research in the field of foreign language teaching, we do not
know at all. In addition, a recent survey by the Association of Dutch Universities
(VSNU) showed that the scientific output has increased considerably in all fields
over the last decade.
3 Requirements for the near future
It is surprising to see how in very recent times people in the context of the
European Common Market have been growing acutely aware of the fact that on
1 January 1993 the unification process of the European countries concerned will
take a major step towards doing away with barriers of all kinds between the
countries. A growing number of people also seem to realize that language
communication, i.e. the efficient use of the languages of Europe within
the Community itself and also of a number of 'outside' languages, will play an
important and critical part in bringing the process of unification to a successful
end.
A major recent development is the establishment of a vast joint programme
for promoting the teaching and learning of foreign languages in the European
Community, called LINGUA. In the preamble of the programme proposal, lack
of foreign language skills is called "the Achilles Heel in the Community-wide effort
to make the free movement of persons and ideas a practical reality" (Document
no. 6614-89 of the 1321st session of the Council of Europe and the
Ministers of Education, May 22, 1989). The central aim of the programme,
which is to start in 1990 and for which a fairly large budget has been set aside,
is "to increase the capacity of the citizens of the Community to communicate
with each other by a quantitative and qualitative improvement in the teaching
and learning of foreign languages within the European Community" (o.c.). Similar
considerations have, also very recently and at very short notice, incited the
Dutch Ministry of Education to commission the writing of a National Action
Programme for foreign language use and teaching in the Netherlands. The European
perspective is to be one of the main issues.
One of the leading principles agreed upon by the European nations is that
the rich and diverse heritage of European cultures and languages should not be
jeopardized in this process. This cannot but mean that the attention for FLT,
which in some of the countries has never been overwhelming so far, will have to
be increased in all the countries. A number of private foundations have been
working towards this end by organizing a series of conferences in the last few
years, bringing together experts in the field of FLT and international communication,
politicians and representatives from the world of business and commerce,
to discuss the immense problems and possible solutions. A major role in
this enterprise has been played by the European Cultural Foundation, which
also co-funded the conference from which the present volume arose. It is interesting
to see that the 'manifesto' drawn up at a previous conference, held in Madrid,
June 1987, not only stresses the great importance of utilizing Europe's
diversity of cultures and languages and of overcoming the difficulties caused by
that very diversity by a major effort to improve FLT, quantitatively and qualitatively,
but also stresses the importance of the promotion of empirical research
into all aspects of the problem area, inclusive of the teaching of foreign languages.
In the European context, therefore, it is very gratifying to see that the demand
from applied linguists for more research into FLT is backed by a growing
awareness in other, also political, circles that a great deal of work desperately
needs doing. If we can come up with the right ideas for empirical research, it is
our conviction that the opportunities for carrying it out will be made available.
What, now, are the right ideas for empirical research? Everybody will agree
that a first requirement is for our research to be more truly empirical. There is
no need to stress here that repeated statements of just opinions and hunches on
what should be taught and especially on how foreign languages should be taught,
will not further the cause of FLT any more than they have done so far. How the
research effort, both as to what should be taught and how, can be made more
truly empirical is one of the main themes in the other contributions to this volume.
Our next point is one on which we may not all agree as wholeheartedly. In
his 1988 paper Van Els stressed, as he had done before, that the main source
of inspiration for applied linguistic research should be sought primarily in FLT
itself. As a 'problem-oriented' discipline applied linguistics should be concerned
with questions originating from the actual teaching of foreign languages, and not
from one of the related source disciplines. In the paper in question he argued
that the sometimes vehement academic dispute in the FRG between Sprachlehrforscher
('language-teaching researchers') like Karl-Richard Bausch on the
one hand, and Zweitspracherwerbsforscher ('SLA researchers') like Henning
Wode on the other, may well find its main explanation in a fear felt by the former:
that the weight given by Wode to second language learning
research risks turning FLT once again into 'the child of fashion' of any
new development in any of the source disciplines, most prominently, of
course, linguistics, but also developmental psychology.
Two further points that we think are of general interest relate to more specific
aspects of research into FLT. First of all, it is not uncommon to set the
goals of FLT (i.e. of any programme serving whatever target group) at the highest
level imaginable, i.e. at native or at least native-like level. Usually this is
done by people who have not given the matter any serious consideration, but it
also happens that it is a point of view taken deliberately and stated in the most
explicit terms. Our point of view is that not only is it, in by far most instances,
fully unrealistic to set one's goals so high, as everyone would agree, but also
that different aims set for teaching programmes may fundamentally affect the
teaching and learning that should lead up to those aims. What may be valuable
procedures and practices in one programme may lose their strengths in programmes
in which one attempts to achieve different sets of goals. What point
would there be in stressing absolute correctness of spoken competence in a programme
that sets out to achieve a high level of reading ability? In this kind of
programme there will be very little need for listening comprehension exercises,
let alone for pronunciation drills. The more we aim at explicitly defining particular
sets of learning goals aimed at satisfying particular learning needs ('learning
units', 'modules'), the more we will have to adopt teaching methodologies tailored
to achieving those goals with a maximum of effect, with the highest
possible level of efficiency. So far, there is very little empirical evidence as to
which teaching methodologies to choose under those different circumstances.
The second point in this connection — also related to goals of teaching — was
elaborated in a paper by Van Els and Weltens (1989: 23). For brevity's sake, we
will simply quote the relevant passage from the paper:
"The (second) point is that FL loss caused by non-use results in less—and,
possibly, also different—language competence from the competence achieved
right at the end of the FL course. It is often the case in our educational systems
that language courses are followed after their completion by a number of
years of non-use, before pupils are expected to apply the language competence
acquired in real-life communicative situations. In such a case the final
objective of a language course cannot be exactly the same as the competence
required later for actual communicative usage, but—in order to make up for
the loss sustained in the meantime—may well have to be higher and, possibly
even, different."
Now that our project into the loss of school-French in the Netherlands has
been completed, we may have to adapt the previous statement somewhat. For,
surprisingly enough, what we found for the written receptive skills (i.e. reading
comprehension) after two years of non-use was an increase rather than a decrease
(cf. Weltens 1989). But, whichever way, our conclusion cannot but be that
effects of FLT programmes should not be measured merely as the direct outcome
of the particular programmes in question, i.e. measured immediately after
the completion of the teaching process.
4 Concluding remarks
Skimming through the 3rd edition of the Handbook of Research on Teaching
(Wittrock 1986), for the purposes of preparing this introductory paper, we were
struck by two things that may have some importance for our present discussions.
It is our experience that in FLT literature very little reference is made either
to general educational research or to research into other school-subjects; the
contributions to the present volume on the whole are no exceptions to this rule.
What the reasons may be for this phenomenon, one may make a fair guess at. It
probably has to do with the fact already mentioned, viz. that those doing actual
research into FLT have usually done so taking one of the source disciplines,
usually linguistics, as their main or sole source of inspiration. It is our conviction
that taking adequate note of the relevant literature in the two educational fields
mentioned would help us greatly to formulate our research hypotheses on the
basis of the problem-area itself, i.e. the actual teaching of foreign languages. To
give just one example, had they taken proper note of recent research in other
fields, some of us might think twice before hypothesizing, on the basis of
insights derived from first-language learning research, a minimal role for the
teacher (or for teaching, for that matter) in the process of learning foreign
languages. Both Brophy and Good (1986: 370) and Fraser et al. (1987: 235),
for example, report that recent findings have, to quote the latter, "dispelled
the notion that the only important factors in predicting student outcomes are
those that cannot be altered by teachers or the school," or to quote Brophy and
Good: "The myth that teachers do not make a difference in student learning has
been refuted."
References
Brophy, J., and T.L. Good. 1986. "Teacher behavior and student achievement." Wittrock
1986.328-375.
DES (Department of Education and Science). 1983. Foreign languages in the school curriculum.
London: Welsh Office.
Fraser, B.J., H.J. Walberg, W.W. Welch, and J.A. Hattie. 1987. "Syntheses of educational productivity
research." International Journal of Educational Research 11/2.145-252.
Henningsson, B. 1986. "Foreign language teaching in Swedish schools." FIPLV World News 6.3-4.
Van Els, T. 1988. "European developments in applied linguistics." Applied linguistics in society (=
British Studies in Applied Linguistics, 3) ed. by P. Grunwell, 16-29. London: CILT.
Van Els, T., and B. Weltens. 1989. "Foreign language loss research from a European point of
view." ITL Review of Applied Linguistics 83/84.19-35.
Von Elek, T., and M. Oskarsson. 1975. Comparative method experiments in foreign language teaching:
The final report of the GUME/Adults project. Gothenburg: School of Education.
Weltens, B. 1989. The attrition of French as a foreign language (= Studies on Language Acquisition,
6). Dordrecht/Providence, RI: Foris.
Wittrock, M.C., ed. 1986. Handbook of Research on Teaching. 3rd Edition. New York: Macmillan.
Wong Fillmore, L., and C. Valadez. 1986. "Teaching bilingual learners." Wittrock 1986.648-685.
Zapp, F.J. 1979. Foreign language policy in Europe. An outline of the problem. Brussels: European
Cooperation Fund.
Section II—Measurement and Research Design
Introduction to the Section Measurement and
Research Design
Ralph B. Ginsberg
In a volume on research in foreign language teaching and learning it would
seem to go without saying that the field needs research that meets the highest
standards of rigor and relevance we can reasonably apply. Claims about the efficacy
of one or another language teaching methodology, and intuitions based on
centuries of experience, abound. If there is no consensus, and if we cannot rely
on demonstrably generalizable, patently successful examples to tell us which of
the proposed approaches are valid and which spurious, then surely systematic (empirical)
research, using methods which have proven successful in other fields, must
hold the key to determining "what works" and what doesn't. But what specific research
is needed: what do learners need to know to enhance their learning opportunities?
what do teachers, tutors, instructional designers need to know in
order to construct lessons, materials, learning environments, and foreign language
curricula? what do counselors, administrators, and policy makers need to
know in order to place learners in the right settings, to organize effective foreign
language learning institutions, and to see to it that broader national goals requiring
foreign language skills are met? And how should this knowledge be acquired:
how should the key variables be measured? what research designs
should be employed to establish which of many contradictory claims, hypotheses,
and explanations are in fact valid? This section contains four papers addressing
the latter questions of research methodology: measurement and design.
The papers by Klein-Braley and Oscarson are concerned with measurement,
or rather with language testing as it is currently employed in classrooms,
schools, and educational systems on both sides of the Atlantic. Both argue for
application of the best psychometric methods to a variety of language testing
Focus on Form:
A Design Feature in Language Teaching
Methodology
Michael H. Long
1 Against methods
Language teacher education programs persist in presenting classroom options
to trainees in terms of methods. While many have stopped pretending that
any one method is a panacea, or at least that they know which one is, most nevertheless
continue to use method as a unit of analysis in their professionally
oriented courses, and some even give college credit for training in particular
methods taught by their developers or licensed acolytes. Books on methods sell
very well, books surveying methods do even better, and expensive one-day "seminars"
offering training in particular methods are rarely short of customers. Yet
it is no exaggeration to say that language teaching methods do not exist, or at
least not where they would matter if they did: in the classroom.
There are at least four reasons for avoiding the methods trap. First, even as
idealized by their developers, groups of methods overlap considerably, prescribing
and proscribing many of the same classroom practices. For example, while
one method may have teachers provide feedback on error using hand-signals,
and another verbally, both prescribe "error correction". Almost all methods in fact
advocate error correction (Krashen and Seliger 1975).
Second, when third parties analyze lesson transcripts (records of what teachers
and learners actually do, as opposed to what methodologists tell them to
do), brief excerpts can occasionally be identified as the product of this or that
method, but the classifications usually have to be made on the basis of one or
two salient but (as far as we know) trivial features, e.g. whether students are
informed of the commission of error verbally or non-verbally. Quite lengthy
excerpts are often impossible to distinguish, especially if taken from real classes, as
opposed to staged demonstration lessons (Dinsmore 1985; Nunan 1987).
Third, studies that have set out to compare the effectiveness of supposedly
quite different methods (e.g. Scherer and Wertheimer 1964; Smith 1970; Von
Elek and Oskarsson 1975) have typically found little or no advantage for one
over another, or only local and usually short-lived advantages. One interpretation
of such results is that methods do not matter. Another is that methods do
not exist, among other reasons, because most teachers tend to do much the same
things (many methods require this, after all), whatever they are supposed to be
doing, especially over time. The absence of a systematic observational component
in most of the comparative methods studies makes either interpretation
problematic. However, the second view is supported retrospectively by descriptive
studies which have found the same classroom practices surviving differences
not only in "methods" (Nunan 1987), but also in professional training (Long and
Sato 1983), materials (Phillips and Shettlesworth 1975; Long, Adams McLean
and Castanos 1976; Ross, to appear), teaching generations (Hoetker and
Ahlbrand 1969) and teaching experience (Pica and Long 1986).
Fourth, method may or may not be a useful analytic construct for teachers
in training, but it is not a conceptual basis for how they operate in practice.
Numerous studies of the ways content teachers plan lessons and recall them
afterwards show that they think of what transpires in the classroom in terms of
instructional activities, or tasks (for review, see Shavelson and Stern 1981;
Crookes 1986). The same appears to be true of FL teachers. Swaffer, Arens and
Morgan (1982) conducted a six-month comparative methods study ("comprehension"
and "four skills" approaches) of German teaching at the University
of Texas. Classroom observations and debriefing interviews with teachers at the
end of the study showed that, despite the teachers having received explicit training
in the methods and (supposedly) having each used one or the other for a semester,
there was no clear distinction between them in their minds or in the
classroom practices used across groups.
For these and other reasons, it is clear that "method" is an unverifiable and
irrelevant construct when attempting to improve classroom FL instruction.
Worse, it may actually do harm by distracting teachers from genuinely important
issues. Saying that methods do not exist and so do not matter at the classroom
level does not mean, after all, that what goes on in classrooms does not matter.
On the contrary, there is growing evidence of the importance of classroom processes,
of pedagogic tasks, and of qualitative differences in classroom language
use for success and failure in FLs (for review, see Chaudron 1988). Rather than
focus on method as the key, however, we would do better to think in terms of
psycholinguistically relevant design features of learning environments, preferably
features which capture important characteristics of a wide range of syllabus
types, methods, materials, tasks, and tests. It is to one of these, focus on form,
that we now turn.
2 Focus on form in language teaching
Many developments in foreign language syllabus design, materials writing,
methodology and testing during the past 30 years reflect the tension between the
desirability of communicative use of the FL in the classroom, on the one hand,
and the felt need for a linguistic focus in language learning, on the other. However,
while discussion has occurred in staff-rooms and journals alike, it has
generally concerned how best to achieve such a focus, not whether or not to
have one. Most applied linguists and pedagogues continue to advocate teaching
and testing isolated linguistic units of one kind or another in one way or another.
Thus, while procedural, process and task-based alternatives are available (see
Prabhu 1987; Breen 1987; Long and Crookes 1989), the overwhelming majority
of syllabi are still structural, notional-functional or a hybrid, and superficially
different "methods", like ALM, TPR and the Silent Way, all teach one linguistic
item at a time (or assume they do), in building-block fashion. Pervasive classroom
practices, such as grammar and vocabulary explanations, display questions,
fill-in-the-blanks exercises, dialog memorization, drills and error correction, all
entail treatment of the language as object, and so do discrete-point language
tests.
There have always been a few dissenting voices. Newmark (1966), Newmark
and Reibel (1968), Corder (1967) and Allwright (1976), among others, have argued
strongly against "interfering" with language learning. While differing considerably
both in the detail of their own proposals and in the rationales offered
for them, each has claimed that the best way to learn a language, inside or outside
a classroom, is not by treating it as an object of study, but by experiencing it
as a medium of communication.
More recently, some non-interventionist positions have been espoused on
the basis of second language acquisition (SLA) theory and research findings (see
e.g. Dulay and Burt 1973; Ellis 1984; Felix 1981; Krashen and Terrell 1983;
Prabhu 1987; Wode 1981). Most often cited in this context are the well attested
developmental sequences in interlanguage (IL), such as those for Swedish negation,
English relative clauses and German word order. These sequences are
fixed series of overlapping stages, each characterizable by the relative frequency
of IL structures, which learners apparently have to traverse on the way to mastery
of the target language system. (For the most comprehensive study of this
phenomenon, see Johnston 1985.)
Numerous studies show, for instance, that ESL negation has a four-stage
sequence (for review, see Schumann 1979):
Stage                  Sample utterances
(1) No + X             No is happy/No you pay it
(2) no/not/don't V     They not working/He don't have job
(3) aux.-neg.          I can't play/You mustn't do that
(4) analyzed don't     I didn't see her/She doesn't live there
At stages 1 and 2, not just Spanish speakers, whose L1 has pre-verbal negation,
but also Japanese learners, whose native system is post-verbal, initially produce
pre-verbally negated utterances in ESL (Gillis and Weber 1976; Stauble
1981), although the Japanese abandon the strategy sooner (Zobl 1982). Pre-verbal
negator placement appears to reflect strong internal pressures, for it is widely
observed in studies of both naturalistic and instructed SLA. Turkish speakers
receiving formal instruction, for example, start with pre-verbal negation in
Swedish, even though both L1 and L2 have post-verbal systems (Hyltenstam
1977).
With minor variations, the evidence to date suggests that the same developmental
sequences are observed in the ILs of children and adults, of naturalistic,
instructed and mixed learners, of learners from different L1 backgrounds, and of
learners performing on different tasks. L1 differences occasionally result in additional
sub-stages and swifter or slower passage through stages, but not in disruption
of the basic sequence by skipping stages (for review, see Ellis 1985;
Larsen-Freeman and Long, in press; Zobl 1982).
Passage through each stage, in order, appears to be unavoidable, and obligatoriness
has been incorporated into the definition of "stage" in SLA (Meisel,
Clahsen and Pienemann 1981; Johnston 1985). As would be predicted if this definition
is accurate, it also seems that developmental sequences are impervious
to instruction. It has repeatedly been demonstrated that morpheme accuracy orders
and developmental sequences do not reflect instructional sequences (Lightbown
1983; Ellis 1989), and tuition in a German SL word order structure beyond
students' current processing abilities has been shown not to result in learning
(Pienemann 1984).
The results for developmental sequences, together with related findings of
common (although not invariant) naturalistic and instructed morpheme accuracy
orders, show that language learning is obviously at least partly governed by
forces beyond a teacher's or textbook writer's control. This realization has in
turn led some theorists to conclude that classrooms are useful to the extent that
they provide sheltered linguistic environments for beginners, but that it does not
help for teachers to focus on linguistic form. An inference that could easily be
drawn from such interpretations is that there are only two options in this area of
course design: either (1) a linear, additive syllabus and methodology whose content
and focus is a series of isolated linguistic forms (sound contrasts, lexical
items, structures, speech acts, notions, etc.), or (2) a program with no overt focus
on linguistic forms at all. While this turns out to be a false dichotomy, focus on
form is a potentially important design feature for distinguishing instructional
methodologies and settings.
Focus on form is a feature which reveals an underlying similarity among a
variety of (a) teaching "methods", e.g. ALM, TPR, Grammar Translation and Silent
Way, (b) syllabus types, e.g. structural, notional-functional, lexical, and (c)
program types, e.g. submersion, immersion, sheltered subject-matter, which on
the surface appear to differ greatly. Groups (a) and (b) all utilize an overt focus
on form; Group (c) does not. It also allows generalizations across traditional
boundaries, identifying a link between the program types in group (c) and, in
theory at least, a linguistically non-isolating teaching "method", such as the Natural
Approach (Krashen and Terrell 1983). At the classroom process level, techniques,
procedures, exercises and pedagogic tasks can also be categorized as to
whether or not they either permit or require a focus on form. Display questions,
repetition drills and error correction, for example, all overtly focus students on
form; referential questions, true/false exercises and two-way tasks do not. Finally,
while many potentially relevant design features will distinguish some
methods, syllabi, tasks and tests from others, few have the valency of focus on
form. It appears to be a parameter one value or another of which characterizes
almost all language teaching options.
Five caveats are in order. First, it is not being suggested that whether or not
a program type, syllabus, method, task or test focuses on form is the only
relevant design characteristic or that important differences will not exist among
members of groups which share the feature, and vice versa. Second, while most
programs, syllabi, methods, tasks and tests either do or do not overtly focus on
form, some within the former group differ in the degree to which they isolate
linguistic structures, not to mention as to how they do so; there are, in other words,
relative as well as absolute, within-group as well as inter-group, differences.
Third, it is likely that students will often focus on form when teachers or
materials designers intend them not to, and ignore form when they are supposed to
concentrate on it. Fourth, some degree of awareness of form and a focus on
meaning may not be mutually exclusive on some tasks (for review, see Schmidt
1990). Fifth, the fact that the distinction can be made does not mean that it
should; whether it is important is a theoretical and/or an empirical matter.
3 Focus on form: a psycholinguistic rationale
The practice of isolating linguistic items, teaching and testing them one at a
time, was originally motivated by advances in behaviorist psychology and
structuralist linguistics. Combined with the advent of a world war and a sudden need
for fluent foreign language speakers, these events led to the growth of ALM and
its many progeny. As distinct from a focus on form, to which we return below,
structural syllabi, ALM, and variants thereof involve a focus on forms. That is to
say, the content of the syllabus and of lessons based on it is the linguistic items
themselves (structures, notions, lexical items, etc.); a lesson is designed to teach
"the past continuous", "requesting" and so on, nothing else.
Arguments abound against making isolated linguistic structures the content
of a FL course, that is, against a focus on forms. Of the hundreds of studies of
interlanguage (IL) development now completed, not one shows either tutored or
naturalistic learners developing proficiency one linguistic item at a time. On the
contrary, all reveal complex, gradual and inter-related developmental paths for
grammatical subsystems, such as auxiliary and negation in ESL (Stauble 1981;
Kelley 1983), and copula and word order in GSL (Meisel, Clahsen and
Pienemann 1981). Moreover, development is not unidirectional; omission/suppliance
of forms fluctuates, as does accuracy of suppliance.
Although most syllabi and methods assume the opposite, learners do not
move from ignorance of a form to mastery of it in one step, as is attested by the
very existence of developmental sequences like that for ESL negation. Typically,
when a form first appears in a learner's IL, it is used in a non-target-like manner,
and only gradually improves in accuracy of use. It sometimes shifts in function
over time as other new (target-like and non-target-like) forms enter (Huebner
1983). It quite often declines in accuracy or even temporarily disappears
altogether due to a change elsewhere in the IL (see, e.g. Meisel, Clahsen and
Pienemann 1981; Huebner 1983; Lightbown 1983; Neumann 1977), a
phenomenon sometimes describable as U-shaped behavior (Kellerman 1985). Further,
attempts to teach isolated items one at a time fail unless the structure happens
to be one the learner can process and so is psycholinguistically ready to acquire.
In Pienemann's (1984) terminology, learnability determines teachability. Finally,
as language teachers, employers and learners alike will attest, there is a great
difference between structural knowledge of a language, when that is achieved,
and ability to use that knowledge to communicative effect.
As noted earlier, facts about IL development like these have led some to
advocate that teachers abandon not just a focus on forms, but a focus on form,
i.e. any attention to language as object, as well. Flaws in this reasoning are
obvious. Further, reviews of studies of the effects of instruction on IL development
(Harley 1988; Long 1988) find clear evidence of some beneficial effects of a
focus on form, and suggestive evidence of others. Briefly, while it is true that
instruction does not seem capable of altering sequences of development, it does
appear to offer three other advantages over either naturalistic SLA or classroom
instruction with no focus on form. (1) It speeds up the rate of learning (for
review, see Long 1983). (2) It affects acquisition processes in ways possibly
beneficial to long-term accuracy (Lightbown 1983; Pica 1983). And most crucially, on
the basis of preliminary data, (3) it appears to raise the ultimate level of
attainment. Further, as White (1987, 1989) has argued, incomprehensible input and
drawing learners' attention to inadmissible constructions in the L2 (two kinds of
negative evidence) may be necessary when learning from positive evidence
alone will be inadequate. To illustrate, an L1 may allow placement of adverbs of
manner more flexibly than an L2. "He drinks every day coffee" and "He drinks
coffee every day" are both acceptable in French, for example, but not in English.
Both will be communicatively effective in English, however, with the result that
the French learner of English (but not the English learner of French) will need
negative input (e.g. error correction) on this point.
Whereas the content of lessons with a focus on forms is the forms themselves,
a syllabus with a focus on form teaches something else (biology, mathematics,
workshop practice, automobile repair, the geography of a country where
the foreign language is spoken, the cultures of its speakers, and so on) and
overtly draws students' attention to linguistic elements as they arise incidentally
in lessons whose overriding focus is on meaning, or communication. Views
about how to achieve this vary. One proposal is for lessons to be briefly
"interrupted" by teachers when they notice students making errors which are (1)
systematic, (2) pervasive and (3) remediable. The linguistic feature is brought to
learners' attention in any way appropriate to the students' age, proficiency level,
etc. before the class returns to whatever pedagogic task they were working on
when the interruption occurred. (For details and a rationale, see Crookes and
Long 1987; Long, in press.)

Figure 1. Noun phrase accessibility hierarchy
least marked   1. subject (The man that stole the car...)
               2. direct object (The man that the police arrested...)
               3. indirect object (The car that he paid nothing for...)
               4. object of a preposition (The man that he spoke to...)
               5. possessive/genitive (The man whose...)
most marked    6. object of a comparative (The man that Joe is older than...)

An example of the probable effect of instruction on ultimate attainment
comes from work on the acquisition of relative clauses in a SL. Several studies
(e.g., for English: Gass 1982; Gass and Ard 1980; Pavesi 1986; Eckman, Bell and
Nelson 1988; for Swedish: Hyltenstam 1984) have shown that both naturalistic
and instructed acquirers develop relative clauses in the order predictable from
the noun phrase accessibility hierarchy (Keenan and Comrie 1977; Comrie and
Keenan 1979; see Figure 1), although with occasional reversals of levels 5 and 6.
Of particular interest in the present context, Pavesi (1986) compared
relative clause formation by instructed and naturalistic acquirers. The former
were 48 Italian high school students, ages 14-18, who had received from 2 to 7
years (an average of 4 years) of grammar-based EFL instruction and who had
had minimal or (in 45 of 48 cases) no informal exposure to English. The
untutored learners were 38 Italian workers (mostly restaurant waiters), ages 19-50,
who had lived in Scotland anywhere from 3 months to 25 years (an average of 6
years), with considerable exposure to English at home and at work, but who had
received minimal (usually no) formal English instruction.
Relative clause constructions were elicited using a set of numbered pictures
and question prompts ("Number 7 is the girl who is running", and so on).
Implicational scaling showed that both groups' developmental sequences correlated
significantly with the noun phrase accessibility hierarchy. There were two
kinds of differences between the groups, however. First, naturalistic learners
produced statistically significantly more full nominal copies than the instructed
learners (e.g. "Number 4 is the woman who the cat is looking at the woman"),
whereas instructed learners produced more pronominal copies ("Number 4 is the
woman who the cat is looking at her"). Given that neither English nor Italian
allows copies of either kind, this is further evidence of the at least partial
autonomy of IL syntax, a claim also supported by the developmental sequence
itself, of course. Interestingly, the relative frequencies of the different kinds of
copies suggest that the instructed learners had "grammaticized" more, even in the
errors they made, a result consistent with findings by Pica (1983) and Lightbown
(1983). Second, more instructed learners reached 80 percent criterion on all of the
five lowest NP categories in the hierarchy, with differences attaining statistical
significance at the second lowest (genitive) level and falling just short (p < .06)
at the lowest (object of a comparative) level. More instructed learners (and very
few naturalistic acquirers) were able to relativize out of the more marked NPs in
the hierarchy. In considerably less average time, that is, instructed learners had
reached higher levels of attainment.
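
The logic of implicational scaling with an accuracy criterion can be made concrete with a minimal sketch. The accuracy figures below are invented for illustration and are not Pavesi's data; the function names are hypothetical; only the 80 percent criterion and the six-level ordering come from the text. Each learner is scored 1 or 0 at each level of the hierarchy, and a perfectly implicational row shows no 1 after a 0 when levels run from least to most marked. The coefficient of reproducibility is computed here by counting deviations from each learner's best-fitting ideal pattern, which is one convention among several.

```python
from typing import Dict, List

CRITERION = 0.80  # 80 percent criterion, as in the text

# Invented accuracy rates for hierarchy levels 1..6
# (subject ... object of a comparative). Not real data.
accuracy: Dict[str, List[float]] = {
    "learner_A": [0.95, 0.90, 0.85, 0.82, 0.40, 0.10],
    "learner_B": [0.90, 0.85, 0.30, 0.20, 0.00, 0.00],
    "learner_C": [1.00, 0.95, 0.90, 0.88, 0.85, 0.81],
    "learner_D": [0.85, 0.50, 0.83, 0.10, 0.00, 0.00],  # breaks the scale once
}


def to_binary(rates: List[float]) -> List[int]:
    """Score 1 where accuracy reaches criterion, else 0."""
    return [1 if rate >= CRITERION else 0 for rate in rates]


def min_errors(row: List[int]) -> int:
    """Fewest cells that differ from any ideal pattern 1...1 0...0."""
    n = len(row)
    best = n
    for cut in range(n + 1):
        ideal = [1] * cut + [0] * (n - cut)
        errors = sum(a != b for a, b in zip(row, ideal))
        best = min(best, errors)
    return best


def coefficient_of_reproducibility(matrix: List[List[int]]) -> float:
    """1 minus the proportion of cells that deviate from a perfect scale."""
    total_cells = sum(len(row) for row in matrix)
    total_errors = sum(min_errors(row) for row in matrix)
    return 1 - total_errors / total_cells


matrix = [to_binary(rates) for rates in accuracy.values()]
cr = coefficient_of_reproducibility(matrix)
print(f"Coefficient of reproducibility: {cr:.3f}")
```

In this invented sample only learner_D deviates from the predicted order (criterion reached at level 3 but not at level 2), so one cell in twenty-four counts as an error.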
Pavesi's study is a non-equivalent control groups design, so causal claims
are precluded. There are also no data on whether or not the high school students
were ever actually taught relative clauses, or if so, which ones. We know simply
that they received something like a grammar-translation course. The findings
are nonetheless suggestive of the kind of effects a focus on form may have on
ultimate SL attainment. Two other studies, furthermore, have shown that
structurally focused teaching of relative clause formation can accelerate learning, and
also that, at least as far down as level 4 (object of a preposition) in the hierarchy,
instruction in a more marked structure will generalize back up the implicational
scale to less marked structures (Gass 1982; Eckman et al 1988; and see also
Zobl 1985).
SLA research findings like those briefly described here would seem to
support two conclusions. (1) Instruction built around a focus on forms is
counterproductive. (2) Instruction which encourages a systematic, non-interfering focus
on form produces a faster rate of learning and (probably) higher levels of
ultimate SL attainment than instruction with no focus on form. If correct, this would
make [ + focus on form] a desirable design feature of FL instruction. Programs
exist which have this feature, alternating in some principled way between a focus
on meaning and a focus on form. (One example is task-based language teaching.
See Long 1985; Crookes and Long 1987; Long and Crookes 1989; Long, in
press.) Programs with a focus on form need to be compared in carefully
controlled studies with programs with a focus on forms and with (e.g. Natural
Approach) programs with no overt focus on form.
4 Further research
True experiments are needed which compare rate of learning and ultimate
level of attainment after one of three programs: focus on forms, focus on form,
and focus on communication. Preliminary research in this area has produced
mixed results, two studies finding positive relationships between the amount of
class time given to a focus on forms and various proficiency measures
(McDonald, Stone and Yates 1977, for ESL; Mitchell, Parkinson and Johnstone 1981,
for French FL), and a third study of ESL (Spada 1986, 1987) finding no such
effects. (For detailed review, see Chaudron 1988.) All three studies were
comparisons of intact groups which differed in degree of focus on forms, it should be
noted. Research has yet to be conducted comparing the unique program types.
Studies of this kind should be true experiments, employing a pretest/post-test
control group design, and should also include a process component to
monitor implementation of the three distinct treatments. They should utilize multiple
outcome measures, some focusing on accuracy, some on communicative ability
or fluency, thereby avoiding (supposed) bias in favour of one program or
another. The post-tests should include immediate and delayed measures, since at
least one study (Harley 1989) has found that a short-term advantage for students
receiving form-focused instruction had disappeared three months later. Some of the
measures should further reflect known developmental sequences and patterns of
variation in ILs, appropriate for the developmental stages of the subjects as
revealed on the pretests. A distinction should be maintained between
constructions which are in principle learnable from positive instantiation in the input and
constructions which in principle require negative evidence. (For further details
and desirable characteristics of such studies, see Long 1984, forthcoming;
Larsen-Freeman and Long 1989.)
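
A minimal sketch of the bookkeeping behind such a design may help show why both immediate and delayed post-tests are needed. All scores below are invented and carry no empirical claim; the group labels follow the three program types named in the text, and the function name is hypothetical. The sketch only computes mean gains over pretest; a real study would add significance testing and the process measures described above.

```python
from statistics import mean

# Invented scores per learner: (pretest, immediate post-test, delayed post-test).
# These numbers are placeholders for illustration, not findings.
scores = {
    "focus_on_forms": [(40, 55, 48), (35, 52, 44), (42, 58, 50)],
    "focus_on_form": [(41, 60, 59), (36, 54, 55), (43, 61, 60)],
    "focus_on_communication": [(39, 50, 51), (37, 47, 49), (44, 56, 55)],
}


def mean_gains(records):
    """Return (mean immediate gain, mean delayed gain) relative to pretest."""
    immediate = mean(post - pre for pre, post, _ in records)
    delayed = mean(late - pre for pre, _, late in records)
    return immediate, delayed


gains = {group: mean_gains(records) for group, records in scores.items()}

for group, (immediate, delayed) in gains.items():
    print(f"{group}: immediate gain {immediate:.1f}, delayed gain {delayed:.1f}")
```

Reporting the two gains side by side makes a fading short-term advantage visible; a single immediate post-test would hide it.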
Several additional issues need to be addressed, either as separate studies of
the focus on form design feature or as sub-parts of the basic study outlined
above. Many interesting questions remain unanswered, after all. It will be useful
to ascertain which structures require focus and/or negative evidence, and which
can be left to the care of "natural processes" (White 1987). Other possibilities
include studies motivated by implicational markedness relationships designed to
determine the principles governing maximal generalizability of instruction (see,
e.g. Eckman et al 1988). Similarly, one can envisage studies inspired by current
models of UG designed to test the claimed potential of certain structures to
trigger instantaneous (re-)setting of a parameter. An example would be Chomsky's
(1981) work on the pro-drop parameter, and the claimed triggering effects of
expletives with "it" and "there" as dummy subjects (Hyams 1983; Hilles 1986). Finally,
further theoretically motivated work, like that of Pienemann (1984) and
Pienemann and Johnston (1987), is clearly needed on the timing of instruction.
Research of these and other kinds will establish the validity and scope of focus on
form as a design feature in language teaching methodology.
References
Allwright, R.L. 1977. "Language learning through communication practice." ELT Docs 76/3.2-14.
Breen, M.P. 1987. "Contemporary paradigms in syllabus design." Language Teaching 20/2.81-92,
and 20/3.157-174.
Chaudron, C. 1988. Second Language Classrooms. Research on Teaching and Learning. Cam-
bridge: Cambridge University Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Comrie, B. and E.L. Keenan. 1979. "Noun phrase accessibility revisited." Language 55.649-664.
Corder, S.P. 1967. "The significance of learners' errors." International Review of Applied Linguis-
tics 5.161-170.
Crookes, G. 1986. Task classification: a cross-disciplinary review ( = Technical Report, 4.) Honolu-
lu: Center for Second Language Classroom Research, Social Science Research Institute,
University of Hawaii at Manoa.
Crookes, G. and M.H. Long. 1987. "Task-based language teaching. A brief report." Modern
English Teaching (Part 1) 8.26-28 + 61, and (Part 2) 9.20-23.
Dinsmore, D. 1985. "Waiting for Godot in the EFL classroom." ELT Journal 39.225-234.
Dulay, M. and H. Dulay. 1973. "Should we teach children syntax?" Language Learning
24/2.245-258.
Eckman, F.R., L. Bell and D. Nelson. 1988. "On the generalization of relative clause instruction in
the acquisition of English as a second language." Applied Linguistics 9/1.1-20.
Ellis, R. 1984. "The role of instruction in second language acquisition." Language Learning in For-
mal and Informal Contexts ed. by D.M. Singleton and D.G. Little, 19-37. Dublin: IRAAL.
Ellis, R. 1985. Understanding Second Language Acquisition. Oxford: Oxford University Press.
Ellis, R. 1989. "Are classroom and naturalistic acquisition the same? A study of the classroom
acquisition of German word order rules." Studies in Second Language Acquisition 11/3.305-328.
Felix, S.W. 1981. "The effect of formal instruction on second language acquisition." Language
Learning 31/1.87-112.
Gass, S.M. 1982. "From theory to practice." On TESOL '81 ed. by M. Hines and W. Rutherford,
129-139. Washington, DC: TESOL.
Gass, S.M. and J. Ard. 1980. "L2 data: their relevance for language universals." TESOL Quarterly
14/4.443-452.
Gillis, M. and R. Weber. 1976. "The emergence of sentence modalities in the English of
Japanese-speaking children." Language Learning 26/1.77-94.
Harley, B. 1988. "Effects of instruction on SLA: issues and evidence." Annual Review of Applied
Linguistics 9.165-178.
Harley, B. 1989. "Functional grammar in French immersion: a classroom experiment." Applied
Linguistics 10/3.331-359.
Hilles, S. 1986. "Interlanguage and the pro-drop parameter." Second Language Research 2/1.33-
52.
Hoetker, J. and W.P. Ahlbrand. 1969. "The persistence of the recitation." American Educational
Research Journal 6/1.145-167.
Hyams, N. 1983. "The pro-drop parameter in child grammars." Proceedings of the West Coast
Conference on Formal Linguistics ed. by M. Barlow, D. Flickinger and M. Westcoat. Stan-
ford, CA: Stanford University, Department of Linguistics.
Hyltenstam, K. 1977. "Implicational patterns in interlanguage syntax variation." Language Learn-
ing 27/2.383-411.
Hyltenstam, K. 1984. "The use of typological markedness conditions as predictors in second lan-
guage acquisition: the case of pronominal copies in relative clauses." Second Languages. A
Cross-Linguistic Perspective ed. by R.W. Andersen, 39-58. Rowley, MA: Newbury House.
Johnston, M. 1985. Syntactic and morphological progressions in learner English. Canberra, Austra-
lia: Commonwealth Department of Immigration and Ethnic Affairs.
Keenan, E. and Comrie, B. 1977. "Noun phrase accessibility and universal grammar." Linguistic
Inquiry 8.63-99.
Kellerman, E. 1985. "If at first you do succeed..." Input in Second Language Acquisition ed. by S.
Gass and C. Madden, 345-353. Rowley, MA: Newbury House.
Krashen, S.D. and H.W. Seliger. 1975. "The essential contributions of formal instruction in adult
second language learning." TESOL Quarterly 9/2.173-183.
Krashen, S.D. and T. Terrell. 1983. The Natural Approach. New York: Pergamon Press.
Larsen-Freeman, D. and M.H. Long. 1989. Research Priorities in Foreign Language Learning and
Teaching. Washington, DC: Johns Hopkins University, National Foreign Language Center.
Larsen-Freeman, D. and M.H. Long. In press. An Introduction to Second Language Acquisition
Research. London: Longman.
Lightbown, P.M. 1983. "Exploring relationships between developmental and instructional
sequences." Classroom-Oriented Research in Second Language Acquisition ed. by H.W. Seliger and
M.H. Long, 217-243. Rowley, MA: Newbury House.
Long, M.H. 1983. "Does instruction make a difference? A review of research." TESOL Quarterly
17/3.359-382.
Long, M.H. 1984. "Process and product in ESL program evaluation." TESOL Quarterly 18/3.409-
425.
Long, M.H. 1985. "A role for instruction in second language acquisition: task-based language
teaching." Modelling and Assessing Second Language Acquisition ed. by K. Hyltenstam and
M. Pienemann, 77-99. Clevedon, Avon: Multilingual Matters.
Long, M.H. 1988. "Instructed interlanguage development." Issues in Second Language Acquisi-
tion. Multiple Perspectives ed. by L.M. Beebe, 115-141. New York: Newbury House.
Long, M.H. Forthcoming. "The design and psycholinguistic motivation of research on foreign
language learning." To appear in Foreign Language Acquisition Research and the Classroom ed.
by B. Freed. Boston: D.C. Heath.
Long, M.H. In press. Task-Based Language Teaching. Oxford: Basil Blackwell.
Long, M.H., L. Adams, M. McLean and F. Castanos. 1976. "Doing things with words: verbal
interaction in lockstep and small group classroom situations." On TESOL '76 ed. by J.F.
Fanselow and R. Crymes, 137-153. Washington, DC: TESOL.
Long, M.H. and G. Crookes. 1989. Units of analysis in syllabus design. Ms. Department of ESL,
University of Hawaii at Manoa.
Long, M.H. and C.J. Sato. 1983. "Classroom foreigner talk discourse: forms and functions of
teachers' questions." Classroom-Oriented Research in Second Language Acquisition ed. by H.W.
Seliger and M.H. Long, 268-285. Rowley, MA: Newbury House.
McDonald, F.J., M.K. Stone and A. Yates. 1977. The effects of classroom interaction patterns and
student characteristics on the acquisition of proficiency in English as a second language.
Princeton, NJ: Educational Testing Service.
Meisel, J.M., H. Clahsen and M. Pienemann. 1981. "On determining developmental stages in
natural second language acquisition." Studies in Second Language Acquisition 3/2.109-135.
Mitchell, R., B. Parkinson and R. Johnstone. 1981. The foreign language classroom: an observa-
tional study. ( = Stirling Educational Monographs 9.) Stirling: Department of Education,
University of Stirling.
Neumann, R. 1977. An attempt to define through error analysis an intermediate ESL level at UCLA.
M.A. in TESL thesis. Los Angeles, CA: UCLA.
Newmark, L. 1966. "How not to interfere with language learning." International Journal of Ameri-
can Linguistics 32/1.77-83.
Newmark, L. and D.A. Reibel. 1968. "Necessity and sufficiency in language learning."
International Review of Applied Linguistics 6.145-164.
Nunan, D. 1987. "Communicative language teaching: making it work." ELT Journal 41/2.136-145.
Pavesi, M. 1986. "Markedness, discoursal modes, and relative clause formation in a formal and an
informal context." Studies in Second Language Acquisition 8/1.38-55.
Phillips, D. and C. Shettlesworth. 1975. "Questions in the design and implementation of courses in
English for specialized purposes." Proceedings of the 4th International Congress of Applied
Linguistics (Volume 1) ed. by G. Nickel, 249-264. Stuttgart: Hochschule Verlag.
Pica, T. 1983. "Adult acquisition of English as a second language under different conditions of ex-
posure." Language Learning 33/4.465-497.
Pica, T. and M.H. Long. 1986. "The linguistic and conversational performance of experienced and
inexperienced teachers." "Talking to Learn": Conversation in Second Language Acquisition
ed. by R.R. Day, 85-98. Rowley, MA: Newbury House.
Pienemann, M. 1984. "Psychological constraints on the teachability of languages." Studies in Sec-
ond Language Acquisition 6/2.186-214.
Pienemann, M. and M. Johnston. 1987. "Factors influencing the development of language profi-
ciency." Applying Second Language Acquisition Research ed. by D. Nunan, 45-141. Adelaide,
SA: National Curriculum Resource Centre.
Prabhu, N.S. 1987. Second Language Pedagogy. Oxford: Oxford University Press.
Ross, S. Forthcoming. Praxis and product in the EFL classroom. To appear in Evaluating Second
Language Education Programs ed. by C. Alderson and A. Beretta. Cambridge: Cambridge
University Press.
Scherer, G. and M. Wertheimer. 1964. A Psycholinguistic Experiment in Foreign Language Teach-
ing. New York: McGraw-Hill.
Schmidt, R.W. 1990. "The role of consciousness in second language learning." Applied Linguistics
11/2.17-45.
Schumann, J.H. 1979. "The acquisition of English negation by speakers of Spanish: a review of the
literature." The Acquisition and Use of Spanish and English as First and Second Languages
ed. by R.W. Andersen, 3-32. Washington, DC: TESOL.
Smith, P. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign Language
Instruction: The Pennsylvania Foreign Language Project. Philadelphia: Center for Curriculum
Development.
Spada, N. 1986. "The interaction between types of content and types of instruction: some effects
on the L2 proficiency of adult learners." Studies in Second Language Acquisition 8/2.181-199.
Spada, N. 1987. "Relationships between instructional differences and learning outcomes: a pro-
cess-product study of communicative language teaching." Applied Linguistics 8.137-161.
Stauble, A.-M. 1981. A comparative study of a Spanish-English and Japanese-English second lan-
guage continuum: verb phrase morphology. Unpublished Ph.D. dissertation, UCLA.
Swaffer, J.K., K. Arens and M. Morgan. 1982. "Teacher classroom practices: redefining method
as task hierarchy." Modern Language Journal 66.24-33.
Von Elek, T. and M. Oskarsson. 1975. Comparative Methods Experiments in Foreign Language
Teaching. Department of Educational Research. Gothenburg, Sweden: Molndal School of
Education.
White, L. 1987. "Against comprehensible input: the Input Hypothesis and the development of sec-
ond-language competence." Applied Linguistics 8/2.95-110.
White, L. 1989. "The principle of adjacency in second language acquisition: do learners observe
the subset principle?" Paper presented at the Child Language Conference, Boston, MA.
March.
Wode, H. 1981. "Language-acquisitional universals: a unified view of language acquisition." Na-
tive Language and Foreign Language Acquisition. ( = Annals of the New York Academy of
Sciences 379) ed. by H. Winitz, 218-234. New York: New York Academy of Sciences.
Zobl, H. 1982. "A direction for contrastive analysis: the comparative study of developmental se-
quences." TESOL Quarterly 16.169-183.
Zobl, H. 1985. "Grammars in search of input and intake." Input in Second Language Acquisition
ed. by S.M. Gass and C. Madden, 329-344. Rowley, MA: Newbury House.
Pros, Cons, and Limits to Quantitative Approaches
in Foreign Language Acquisition Research
W.E. Lambert
I was asked to focus my remarks on the advantages and disadvantages of
quantitative approaches and designs in research on foreign language acquisition,
with illustrative examples. In addition, the remarks were to be pertinent not only
for scholars in this specialized field but also for everyday members of college
language faculties who have to deal with classrooms of real, live students.
Two preliminary apologies are in order. First, several of my examples will
be drawn from research on elementary or high school youngsters. Even so, I
believe they are nonetheless appropriate for college level educators because the
processes of teaching and learning are fundamentally common ones running
their course at all age levels. The examples are also relevant to college educators
because the new twists in early language education are affecting large numbers
of pupils who are already bringing their new experiences and competencies
along with them right up to college. For instance, the Canadian language
"immersion" experience I will refer to has changed dramatically the more recent
waves of foreign language (FL) or second language (SL) learners who expect a
great deal more of high school and college FL education than was formerly the
case. As a consequence, the goals, means, and overall purpose of college level
courses have had to be modified to accommodate a new breed of FL or SL
student.
Second, I should explain, if not apologize for, making comments at all about
quantification because actually I see matters of research design as no more than
good common sense and statistical experts as officiators or rule-keepers who
have to be tested periodically for their certainty and to be outfoxed when they
are inattentive. But I should be qualified to talk on this topic because, in years
past, I was a statistical assistant for Leon Thurstone and have been a colleague
for years of John Carroll, George Ferguson, and Lee Cronbach, who have tried
to keep me honest, statistically speaking. Furthermore, I have been knee-deep
in quantitative research on language related issues for a long run of years. That
experience has made me a proponent of tight designs and quantitative checkouts
because all other alternatives in language research turn out to be too subjective
and personally biased. The only way I can see to be tough or rigorous on
ourselves and our ideas in this field is to put those ideas to a serious quantitative,
experimental test. This bias of mine, however, has clear limits and what I want to
do here is present what I see as the pros to quantitative approaches as well as
the cons and the limitations.
Complying with this topic assignment meant reading recent reports on
statistical procedures for dealing with performance changes over time when
large-scale evaluation studies are conducted (e.g. the papers of Willett 1988; Bryk and
Raudenbush 1987; and Rogosa, Brandt, and Zimowski 1982); reading through
several large scale, ongoing empirical studies on foreign language pedagogy in
order to get some idea of what is going on in the North American scene; and
then, thinking back on my own involvement in studies of language pedagogy and
attempting to explain what has been going on in these cases, too.
The upshot of all this is that I have three or four macro concerns that will be
the schema for organizing the comments to follow. The concerns are: 1) bigness
versus manageableness in the breadth of empirical studies; 2) the nature of
"process" in the product-versus-process debate in empirical research; and 3) the
tailoring of design and statistics to accommodate more moderately sized
investigations that will be able to explore "deep" processes or underlying mechanisms.
1 Bigness in Language-Related Research
The United States is big and research directors as well as fund suppliers
seem to want to make their studies big, as if one can only keep up the feeling of
national unity if one brings the whole nation or some large region of it into each
empirical test of an educational or social innovation. The common argument is
that if a researcher has a really strong new pedagogical treatment in hand, or a
really important teacher or learner characteristic to examine, its effects should
be robust enough to emerge even when tested across the nation. Consequently,
it is common to hear an administrator in a federal post (e.g. in the United States
Department of Education) say that he/she has two or three somewhat related
empirical studies underway that are national in scope, each at a cost of some five
million dollars. The problem I have with this is that one can convincingly argue
that there are many "nations" all within the U.S.A. For example, one recent
estimate is that only 14% of the American population have Anglo-Saxon
roots, which makes them not much more important than the 13% who
have Germanic origins, the 11% who have African roots, the 11% who have
Hispanic roots, and so on (see Sowell 1983). In fact, I believe that if one were to scale
down distances in North America to a European size, it might well be that there
are equivalent cultural differences in a San Diego-Albuquerque-Chicago-Boston
network (to take a random example of sites) as there are in a
Zurich-Milano-Paris-Amsterdam-London network.
The point is that when research projects become too large they are forced
to overlook the socially distinctive characteristics of regional sites, school
districts, schools, and particularly classrooms. To attempt to attend to these
potentially distinctive features usually overtaxes the capacities of the research
team, and in most cases such issues are bypassed in the search for across-site
trends. Researchers usually realize that there are regional, district, and school
variations in their data that are clear and possibly significant, but they
normally can't deal with them; and this usually means that they are "averaged
out". For example, samples of pupils from various schools are amalgamated in a
treatment-comparison investigation, even though obvious school-to-school
differences in "academic atmosphere" exist, i.e. differences in the attention or
priority given to certain subject matters or to learning in general. My argument
here is that language-related research should be kept as small as possible so
that regional, district, school, principal, teacher, and student variations can
be dealt with adequately. If one were to combine data collected in London and
Amsterdam to test out some particular pedagogical approach, one would likely have
to overlook enormously different views about language learning in the two sites.
But no more so, I would argue, than would transpire in a Boston-Chicago
amalgamation.
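The "averaging out" described above can be made concrete with a small numerical sketch. The scores below are invented purely for illustration (they come from no study cited here), and the site names and the `effect` helper are hypothetical. The sketch shows how a treatment that helps at one site and hurts at another can look like it does almost nothing once the samples are amalgamated.

```python
from statistics import mean

# Hypothetical achievement scores; none of these numbers come from a real study.
scores = {
    "site_A": {"treatment": [62, 65, 68, 70], "comparison": [55, 57, 60, 62]},
    "site_B": {"treatment": [48, 50, 53, 55], "comparison": [55, 57, 60, 62]},
}


def effect(treatment, comparison):
    """Mean difference: treatment group minus comparison group."""
    return mean(treatment) - mean(comparison)


# Effect estimated separately within each site.
per_site = {
    site: effect(groups["treatment"], groups["comparison"])
    for site, groups in scores.items()
}

# Effect estimated after amalgamating pupils from both sites.
pooled_treatment = [s for g in scores.values() for s in g["treatment"]]
pooled_comparison = [s for g in scores.values() for s in g["comparison"]]
pooled = effect(pooled_treatment, pooled_comparison)

print(per_site)  # site_A: 7.75, site_B: -7.0
print(pooled)    # 0.375
```

With these made-up figures the pooled estimate (0.375) hides two sizable and opposite site-level effects, which is the kind of information a small, site-specific study would retain.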
Here are two examples of bigness troubles that I have in mind. The first is
the Baker and De Kanter (1981) review of all methodologically adequate studies
of bilingual education up to the 1980's that were developed for language
minority children and conducted across the United States. The aim of the Baker
and De Kanter report (1981) was to assess the impact of bilingual educational
offerings on math and English achievement scores. The basic criterion for a
successful program was that it showed more learning than would have been the
case without the program.
Setting aside the clear need for either random-assignment controls for
those in or outside a bilingual program, or some quasi-experimental
approximation (e.g. Campbell and Boruch 1975), Baker and De Kanter concluded on the
basis of the studies available that bilingual education didn't have much if any
positive effect. Overall, perhaps that is a relatively true evaluation, but "overall"
in this case covered a multitude of sins. When someone with the patience and
insight of Ann Willig (1985) conducted a meta-analysis of the same studies
reviewed by Baker and De Kanter, she was able to uncover enough of the sins to
come to a much more convincing conclusion and one that was very favorable
toward bilingual education. For instance, Willig found: "In every instance where
there did not appear to be crucial inequalities between experimental and
comparison groups, children in the bilingual programs averaged higher than the
comparison children on criterion instruments" (1985: 312). My point here is that
much of the confusion in the overview of Baker and De Kanter was due to the
fact that they had to look beyond specific cases that were regional, district or
school specific, and it took a Willig not only to consider them but also to show
their importance. In her conclusions, she makes direct reference to the bigness
factor:
"The cost of the national Title VII evaluation could have financed several
programs that included sound, integrated research in the design. Not only would
such an endeavour have produced additional programs for a number of students,
it would also have produced information useful for both evaluation and
program planning. In discussing the necessity for smaller scale, randomized
experiments of educational programs, Campbell and Erlebacher (1970: 207)
write, "We are sure that data from 400 children in such an experiment would
be far more informative than 4,000 tested by the best of quasi-experiments, to
say nothing of an ex post facto study". The results of this synthesis have
confirmed that observation" (Willig 1985: 313).
My second example is a research project directed by my good friend David
Ramirez, and I'm an outside advisor on this one. It is big and expensive, but it is
a very good one, in large part because the research team was dedicated,
interested, and coordinated (Ramirez 1988). The project was solicited by the
Department of Education and was designed for Hispanic American children, especially
those with "limited English proficiency" (LEP). Its purpose is to test whether
early schooling in (a) an all-English program (a type of sink-or-swim option that
circumvents the Spanish home language) is more effective than (b) a traditional
transitional bilingual program (wherein some, but not much, Spanish is used in
order to assist the children to re-program themselves into the all-English
stream), or (c) a quasi maintenance-of-home-language bilingual program that
provides for language arts in Spanish and instruction for part of the day through
Spanish. Option (a) is called the "Immersion Strategy" program, (b) the "Early
Exit" alternative, where "exit" means out of the program to all-English classes,
and (c) "Late Exit" to indicate much less rush to exit, with more emphasis on
helping the children juggle two languages and cultures. (Of course, we
Canadians dislike the misuse of the term "immersion" in the first case because
actually it is a reversal of the intent of immersion education, as I will explain later.)
The implementation of this project has been instructive in several respects.
It was a study requested by the government, through the Department of
Education (D.O.E.), and was motivated by a keen interest in the potential of the
Immersion Strategy option. My guess is that the immersion-in-English option was
congruent with the Reagan administration's views about language minorities, i.e.
that it is basically unAmerican to have American citizens or citizens-to-be
jibber-jabbering in home languages in American public schools. Things like that, the
argument goes, stigmatize minorities and slow their progress towards
Americanization. Better to dive right into English and stay away from "maintenance"
bilingual programs or even "traditional" ones if possible, because both
alternatives only stretch out the assimilation period. Since a few districts around the
country were trying out an immersion-in-English option, the D.O.E. indicated in
the original contract that they wanted a large scale investigation of the
Immersion Strategy classes then starting, and to have them compared with
"comparable" Early Exit programs, the most common form of education available
for minorities. The study would be restricted to Hispanic children only. Thus, the
study was to be big and also expensive because it would follow children for a
four year period. A small group of experienced research consultants would meet
twice a year first to help design the study and then to monitor it.
At the first design-planning meeting, the consultants argued fast and
furiously to change the immersion name to "sink or swim", "submersion",
"drowning", "brain washing" or some such alternative. But the D.O.E. kept it as
"immersion strategy". Then we argued for the inclusion of one alternative, the
Late Exit option, to add a bit of sunshine to the project and on this point, the
D.O.E. was persuaded our way.
Then much time was spent on another really exciting approach to the basic
question, and actually this alternative almost worked out. The idea was to run a
real experiment comparing the three alternatives. We realized that few district
supervisors in the United States could differentiate one of these alternatives
from another. In fact, few school principals or teachers in bilingual/bicultural
programs in the United States know about alternative approaches to teaching
minority children, other than the alternative they have been asked to comply with.
Thus, we had the opportunity to work with one or two districts and to set up,
through random placement of pupils, the three alternatives and test their
relative effectiveness. Parents certainly were no better informed about the
alternatives and they would likely have been willing to participate, since, as is, they
take whatever program the district has decided to offer. This possibility got us
researchers excited because it would have satisfied the basic demands of "good"
experiments according to Donald Campbell's and Ann Willig's specifications.
Note that it would have been relatively small in scale, providing control over
district and region effects, the things that a bigger study can't handle properly. As
well, treatment specifications and teacher selection and training could have
been easily undertaken and monitored, and most important of all, pupils from a
common district or community could have been placed at random in one
treatment or another. Later, contrasting communities could be included. But rather
than crying over a missed opportunity, we as consultants could adjust to the
generosity of D.O.E. to permit the Late Exit alternative to be added to the
contract.
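The random placement of pupils from one district into the three alternatives, as envisaged above, might be sketched as follows. This is only an illustration under stated assumptions: the pupil identifiers, the fixed seed, and the `assign_at_random` helper are hypothetical, and the experiment itself was never run.

```python
import random

PROGRAMS = ["Immersion Strategy", "Early Exit", "Late Exit"]


def assign_at_random(pupil_ids, programs=PROGRAMS, seed=0):
    """Shuffle pupils, then deal them out to programs in rotation.

    The seed is fixed only so that the illustration is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(pupil_ids)
    rng.shuffle(shuffled)
    return {
        pupil: programs[i % len(programs)]
        for i, pupil in enumerate(shuffled)
    }


# Hypothetical roster of 90 pupils from a single district.
pupils = [f"pupil_{n:03d}" for n in range(1, 91)]
assignment = assign_at_random(pupils)

# Count how many pupils landed in each program.
group_sizes = {
    program: sum(1 for p in assignment.values() if p == program)
    for program in PROGRAMS
}
print(group_sizes)
```

Because every pupil comes from the same district and chance alone decides the program, district and region effects are held constant across the three groups, which is the control a nation-wide design cannot offer.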
The main point here, however, is that bigness stalks this project, as it does
so many American educational evaluations or surveys. Let me illustrate. (1) In
order to get a nation-wide view of the relative strengths and weaknesses of the 3
options, 5 states are included: California, Texas, Florida, New York, and New
Jersey (likely equivalents to Moscow, Athens, Bucharest, Amsterdam, and
London, in my mind). This spread, it was argued, would give representation to the
major Spanish-speaking groups in the United States. But this approach means
that little or no attention can be given to the differences in program effects for
the various cultural-historical subgroups classified together here as Hispanic, i.e.
Mexican and Chicano in particular regions and Cuban and Puerto Rican in other
regions. (2) Some states have only one or two of the alternatives in operation in
their schools and few districts or school systems available across the country
have all three alternatives in place. This means that the researcher cannot
determine why a particular state, region or district has "inherited" one
alternative or another, and what effect the values and attitudes underlying the choice of
alternative in vogue might have on that program's relative success or failure. (3)
States, regions, districts, and schools within a district vary also with respect to
the socio-economic and educational background of Hispanic families and these
factors affect the salary and social class backgrounds of public school teachers, and
ultimately the achievement scores of pupils. In sum, then, this large scale
research project, exemplary in many respects and so designed as to circumvent as
many of these potentially confounding variables as possible, has been from its
start too big for its britches. It has had to overlook or work around variables that
are clearly socially significant, i.e. ethnic differences within America's
Spanish-speaking population; state and regional variations in socio-economic status of
families, in demographic clusters of language minorities, and in educational
programs; and school-specific variations in climates or atmospheres that encourage
or discourage learning and teaching. My argument is that less money would be
spent to conduct a coordinated set of real or quasi experiments in different
regional sites, permitting researchers to concentrate on a manageable subset of
sites for separate studies that could be kept small enough to allow researchers to
give attention to as many socially relevant factors as possible. Smaller,
multiple-site approaches to a national issue of this sort could provide a type of "construct
validity" checkout on each of the alternatives. Then university researchers and
student assistants who know about each site could be involved, thereby making
the project less expensive, and testers could be drawn from the various language
minorities in the local communities.
2 Product Research Versus Process Research: A Question of Depth
Michael Long (1984) has recently described some important differences
between process-oriented research and product-oriented research, a
differentiation similar in many respects to "formative" versus "summative" evaluations in
research on second language learning (see Scriven 1967). Product research sets
out to answer questions about the effectiveness, reflected in achievement scores,
of one program (approach or "treatment") compared to another program.
Generally each new educational innovation is ultimately tested for its presumed
merits by means of product evaluations, and, theoretically, well conducted
evaluations could test out and, thereby, inform policy makers on the best course
of education possible. One need never know why a particular approach is the
best alternative if one could be confident that the evaluation had been carefully
conducted: the product outcomes would simply determine which alternative is
the most effective. But it is difficult for human researchers to be careful enough
to satisfy all possible critics, and, especially in big studies, unbeknownst to the
evaluator, happenings intervene while the product is being tested. For example,
pupils following an Early Exit option in the example above might perform
poorest at the time of post-testing because all the good ones in that program had
been exited out (the issue of subject "mortality"). Researchers might have big
enough samples to still deal with the "slow" early exiters, but one would begin to
see weaknesses in the evaluation. Or it could be that the supposedly "bilingual
instruction" given to one treatment group was actually reduced to having a
bicultural teacher instruct through English, possibly with non-native command of
the English language. Consequently, researchers are required to be as
"process-oriented" in their evaluations as possible, that is, to find out what actually
transpires in each classroom under each treatment. This includes the details of
teacher-pupil interactions, analyses of the content of instruction and its form, as
well as the pupil variability in receptivity to the instruction. Clearly, there is a
need for some balance here, as Long recognizes:
"Process evaluations offer many benefits for teachers and administrators alike.
Of these, the most important is that they can document what is going on in
classrooms, as opposed to what is thought to be going on. Using process and
product evaluations in combination, one can then determine not only whether
a program really works, or works better, but if so, why, and if not, why not"
(Long 1984: 422).
The examples Long gives of what can be done in process research are
instructive and interesting. Consider the issue of teacher-pupil interactions.
Suppose it is agreed to video-tape or audio-tape a sample of classes in an
educational experiment on language pedagogy; some samples would be taken
from an innovative, new approach in one case and from a standard old approach
in the other. The tapes are transcribed, and transcriptions usually take five
minutes per minute of tape. To check on transcription accuracy one might want
two transcribers to work independently and calculate their agreement; but note
how costs can accumulate here. Nonetheless, information about the
fine-textured differences between programs can be made apparent in this fashion, e.g.
one set of classes might be found to stress:
"1) structural grading, 2) immediate, forced oral production by students, 3)
avoidance and correction of errors —focus on form, 4) both mechanical and
meaningful language practice, chiefly through memorization of short dialogues
built around basic sentence patterns, and 5) large doses of drillwork"
(Long 1984: 416).
There is no question that researchers would be delighted with such data,
because then they could pinpoint factors that have an effect on product-oriented
achievement measures. There is good common sense here that researchers
appreciate. For instance, Merrill Swain (1987) got such transcripts from classes in
French immersion programs in Canada and found that teachers hardly ever used
the past tense when teaching a history course in French to anglophone students.
And product assessments had noted that these students were not too swift in the
use of the French past tenses! The major point is that researchers like Swain in
this example might miss entirely what was going on in the classroom if they
neglect process concerns in their research.
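The cost and agreement arithmetic behind the transcription step described earlier can be sketched briefly. The five-to-one transcription ratio is taken from the text; everything else is an assumption made for illustration: the 600 minutes of tape, the category labels, and the two hypothetical coders' judgements. Cohen's kappa is used here as one common agreement statistic, not because the chapter prescribes it, and it applies to coded categories rather than to raw transcripts.

```python
from collections import Counter


def transcription_hours(tape_minutes, ratio=5, transcribers=2):
    """Person-hours needed when each transcriber works independently."""
    return tape_minutes * ratio * transcribers / 60


def percent_agreement(a, b):
    """Share of items on which the two coders gave the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in set(a) | set(b)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes assigned to the same ten classroom episodes.
coder_1 = ["drill", "drill", "question", "feedback", "drill",
           "question", "feedback", "drill", "question", "drill"]
coder_2 = ["drill", "drill", "question", "feedback", "question",
           "question", "feedback", "drill", "drill", "drill"]

hours = transcription_hours(600)             # 100.0 person-hours for 10 hours of tape
agreement = percent_agreement(coder_1, coder_2)  # 0.8
kappa = cohens_kappa(coder_1, coder_2)           # about 0.68
print(hours, agreement, round(kappa, 2))
```

Even in this toy case, ten hours of tape already call for about a hundred person-hours of transcription once a second transcriber is added, which is the accumulation of costs noted above.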
Process, however, can be overstressed, and clearly a reasonable balance has
to be struck. Here's an example of too much process at the expense of product,
an example that bothers me. The researchers were searching for "significant
bilingual instructional features", i.e. they attempted to "identify, describe, and
analyze significant instructional features in successful bilingual instructional
settings" and to explore the consequences of these features on the progress of
language minority pupils (Tikunoff 1980: 1, 1981; Fisher and Guthrie 1983). For
this purpose they collected detailed information on what went on in the
classrooms, including teaching styles, whether active or not, among many other
features, assuring the reader of their report that process aspects of research were
admirably covered. The trouble is that, for determining which programs were
"successful", they relied on opinions of local people: administrators, teachers,
parents, and former students. No other independent check on "success" is
mentioned and no attention is given to a contrast or comparison group that did not
receive "significant bilingual instruction". In fact, "significant" is presumed to be
the instruction that transpires in "successful" classrooms. Again, there is no
introduction of comparison groups who were not successful. There is much
valuable information in this work. But it was expensive and spread through three
years, and there is no way to determine and no evidence given to convince me
that these instructional features were either significant or successful. The
neglect of product information in this case means that the researchers did not go
after data from matched groups of Limited English Proficient students who
received either one set of instructional features or a comparison set, and who then
were found to be either successful or not in terms of achievement growth or
improvement. To me, it is a shame to have missed this opportunity to, as Long
suggests, combine product and process concerns in the research. A valuable
suggestion for those wanting to explore the process-product issue in more detail
is the recent work by Craig Chaudron (1988) that demonstrates very nicely the
need for researchers to give ample attention to both process and product.
3 Deeper Forms of Process
There is, however, another way to consider process and product research, a
way that I think goes deeper and captures the interests of another type of
researcher. The particular other way I have in mind was introduced by Lev
Vygotsky back in 1934, although his book appeared in English only in 1962 (Vygotsky
1962). Here's an example:
"It seems to us that [this] phenomenon has not received a sufficiently
convincing psychological explanation, and this for two reasons: First, investigations
have tended to focus on the contents of the phenomenon and to ignore the
mental operations involved, i.e. to study the product rather than the process;
second, no adequate attempts have been made to view the phenomenon in the
context of other bonds and relationships..." (Vygotsky 1962: 71; emphasis
added by W.L.).
The phenomenon Vygotsky was referring to was the changes that transpire
in the normal development of thought from infancy to young adulthood, a
progression from thinking in "complexes" to "pseudo-concepts" or "potential"
concepts to "genuine concepts".
"The processes leading to concept formation develop along two main lines.
The first is 'complex' formation: The child unites diverse objects in groups
under a common 'family name'; this process passes through various stages.
The second line of development is the formation of 'potential concepts', based
on singling out certain common attributes. In both, the use of the word is an
integral part of the developing processes, and the word maintains its guiding
function in the formation of genuine concepts, to which these processes lead"
(Vygotsky 1962: 81).
This endeavour permitted Vygotsky to be:
"the first modern psychologist to suggest the mechanism by which culture
becomes part of each person's nature. The internalization of socially rooted
and historically developed activities is the distinguishing feature of human
psychology" (Cole 1978: 6+57).
To study these "processes" and to find a potential "mechanism" of the sort
referred to here led Vygotsky not only to conduct experiments, using the now
famous Vygotsky blocks (see Vygotsky 1962: 52-81), but to experiment with
children who fit somewhere on a continuum of developmental age steps. Vygotsky
was interested in how children of different ages performed (the product
dimension) and how each child in each age group interacted with the experimenter, in
terms of the details of what was said by each member of the dyad and what each
member meant by what was said (the conventional process concern). More
important, he was also interested in the mental operations involved in each
attempted solution of the problems presented. It is this last step in Vygotsky's
overall approach that I see as a deeper form of process research, a form that
could be a helpful model for research on foreign language learning. Standard
process research can make us aware of what is going on in a classroom and it can
help us be certain that the planned treatment offered to pupils in that classroom
is or is not transpiring (the "treatment verification" function of process research
referred to earlier). For me, the more fundamental processes in foreign
language learning are those that take place in students' minds and in the social
systems students find themselves in, rather than the processes transpiring in
classrooms or in the teacher-student interactions. The only way I see to get at
these more social-psychological processes is through a combined
product-process orientation on the part of the researcher. But to get at the deeper levels, the
researcher has to have some relevant theoretical ideas, even if only
common-sense hunches, to orient the long-range plan of the research. Let me illustrate
what I mean through three or four examples from the Montreal setting.
4 Illustration # 1: Two Solitudes
In Montreal, French and English school systems are and have been
separate; the administration is separate, the schools are in different sites and,
consequently, students and staff are kept exclusively in their own linguistic worlds.
This separateness is nicely represented in an important Canadian novel on the
two major ethnolinguistic communities in Quebec, entitled Two Solitudes
(McClennan 1945). Ailie Cleghorn and Fred Genesee (1984) were interested in
what happens when French and English speaking teachers become members of
a common teaching staff in English language schools that have French
"immersion programs" underway. Their hunch was that the social interactions of the
two groups of teachers would likely reflect the social realities of distant, separate
existences. Data were collected, using observational procedures, over a one-year
period. Thus, an observer recorded relevant events in the schools, in classrooms,
in principals' offices, and in teachers' rooms, especially at break times and lunch
times. It was an unusual event, for both the French teachers and the English
schools involved, to have a sizable subgroup of French teachers working in
otherwise all-English schools. At first, the English speaking teachers showed
normal amounts of politeness and welcome. In the common teachers' room at
lunch period, for example, small tables were arranged so as to accommodate all
staff, and suggestions were made that French might be the language of
communication (a type of "French table") from time to time, so that the English
teachers could get some experience using French, at the same time, they reasoned,
as the French teachers were made to feel at home. The Cleghorn and Genesee
study is noteworthy because it chronicles in the teacher-to-teacher contacts the
slow but sure emergence of the deep, long-standing conflictual nature of
English-French relations in the general society. For instance, there was a gradual
separation and segregation of social contacts, including the use of separate
tables, separate burners on the common stove, and schedules for French and English
usage of the stove. French teachers slowly switched to English (no matter how
poorly they commanded it) for intergroup contacts which, for generations, had
been the expected thing for French-Canadians to do in the presence of
anglophones. To me, this informative study is a good example of a carefully
documented, standard process-oriented study that was designed to go far beyond
the structure and content of the interaction between teachers. Instead, the basic
process data were used to explore a fundamental social-context process
involving society's impact on the school and on cross-group contacts that take place
in this novel form of mixed-group setting. The impact of this deeper societal
process on anglophone children's progress in French, their reluctance to initiate
French conversations outside school, and their expectations that French people
speak English with anglophones were all evident in the product results of the
immersion classes.
5 Illustration # 2: Bilingualism's Effects on Creative-Type Thinking
This is an example of a research project that accompanied a standard
product-oriented evaluation of the progress of anglophone students enrolled in
French immersion programs (see Scott 1973). The basic idea was to see if
"becoming" bilingual would expand or enhance "mental flexibility", an idea that
other research (e.g. Peal and Lambert 1962) had suggested as a possibility
because a very pronounced association had been found between certain aspects of
IQ and bilingualism. An opportunity was seized on to explore the causal
direction of this association, especially the possibility that becoming bilingual causes
an increase in cognitive flexibility.
By the early 1970's, the structure of French immersion programs in schools
had been routinized in terms of agenda and procedure. Consequently, it was
possible to test a group of anglophone children on a measure of cognitive
flexibility (in this case, a measure of "divergent thinking" taken from the work of
Getzels 1962) at the kindergarten and grade 1 levels, before the youngsters were
launched on immersion, and again at the grade 5 and 6 level by which time a
good degree of functional bilingualism was already evident. Anglophone control
groups who followed a conventional all-English program served as an
appropriate comparison since most of the parents of these pupils would also have taken
the immersion option, had it been available to them. The controls, however,
were selected to match the treatment (i.e. immersion) groups in IQ and
socio-economic background. Thus, we were confident that at the start of
elementary school, the treatment and control groups were as alike as one can
ever get them and their early scores on the divergent thinking tests were
essentially identical. The important finding, however, was that the later scores at the
end of grades 5 and 6 were significantly different in favour of the immersion
group, even when tested via English which was used much less than 50 percent
of the time in the schooling of the immersion student.
This outcome not only says something important about the causal link
between becoming bilingual and cognitive flexibility, but it also casts light on a very
important underlying mental process that permits one to infer what likely goes
on in the immersion experience, something far below the surface events of
teacher-student interaction patterns. Thus, this study is a clear illustration of a
Vygotsky-type process that was studied through an apparently product type
(pre-post) testing of the performance of the children on a standard, psychometrically
sound measure of cognitive activity.
6 Illustration # 3: The Effects of Attitudes and Motivation on Foreign Language Study
Robert Gardner and I have had a long-standing interest in the role played
by students' attitudes towards the foreign group whose language they are
studying, whether they are motivated by "instrumental" reasons (those with a
practical pay-off) or "integrative" reasons (e.g. interest in or inquisitiveness about
the foreign people and their culture) (see Gardner and Lambert 1972). Since
our early work, Gardner (1981) has accumulated an impressive array of
empirical studies that explore the ways in which attitudes and motivations affect
language acquisition proficiency, performance in the classroom, and willingness to
take advanced courses in the language. The basic research design is to measure,
as of the start of FL training, the foreign language aptitude, the verbal IQ, the
socioeconomic background, and the attitude-motivational profile of large
numbers of primary and secondary school students and to follow them through one
or more years of FL training, with repeated tests of FL achievement. Thus, a
basically product oriented approach is followed. Numerous small-scale
replications reveal that measures of attitudes toward the other cultural group and
motivational interest in mastering the FL are correlated, forming a cluster that
stands apart from a second cluster made up of tests of aptitude for learning a FL
and verbal intelligence. Furthermore, each cluster is as closely correlated to FL
achievement as the other. The fact that the attitude-motivation cluster is as good
a predictor of FL achievement as verbal intelligence or language aptitude and
that it is statistically independent from the aptitude-intelligence cluster has great
social significance because it indicates that anyone, even the intellectually and
linguistically non-gifted, can be successful in FL study if they want to and
especially if they want to for the "right" attitudinal reasons.
The more recent research of Gardner and his students shows that the
attitude-motivation index is also strongly associated with perseverance in the FL
study (Gardner and Smythe 1975; Gardner 1981), that is, the more integratively
oriented the attitudes and motivation of students are, the more they avail
themselves of opportunities to practice the second language, and the more often they
decide to take advanced level courses at the college level. It is also clear that
attitudes and motivation affect classroom interactions (Glicksman 1981; Gardner
1981). Trained observers of FL classrooms found that the more "integratively"
oriented students (those with favorable attitudes and non-practical motivations)
volunteered more frequently, gave more correct answers publicly, and received
more positive feedback from teachers than did those less integratively oriented.
There were no subgroup differences, however, in asking the teacher questions,
in demonstrating knowledge beyond that solicited, nor in indications of
classroom anxiety.
For me, these results indicate that a deeper process, reflected in an
attitude-motivation complex, is at work in FL learning. Furthermore, this deeper
process seems to have an effect on the content and structure of the
teacher-student interaction, the more standard form of classroom process and the type more
commonly dealt with by FL researchers.
7 Illustration # 4: Processes Underlying the Transformation of Subtractive to Additive Forms of Bilingualism
My final example is both societal and personal in nature. It deals with small
communities in northern New England in the United States whose residents
have French as a heritage language, being third or fourth generation immigrants
from French Canada, but who function otherwise in an all-English American so
ciety. These "Franco-Americans" have kept French up mainly as an informal so
cial language, especially with family members, and mainly for oral
communication; there is very little reading or writing in French. As these
families function more and more in English, they gradually lose French. Their
stage of bilinguality reflects the gradual substitution of English for French, what
we refer to as "subtractive" bilingualism, meaning that even though at a certain
time in their lives they are functionally bilingual, French is being eliminated
from their lives and replaced by English (Dubé and Herbert 1975; Lambert,
Giles, and Picard 1975; Lambert, Giles, and Albert 1976). The implied contrast
is with an "additive" form of bilingualism where speakers of a dominant,
prestigious and communicationally useful language (like English in the United States
or French in France) can add a second or foreign language to their linguistic
repertoires with no fear that the first language and its cultural supports will be
upset in any sense. Rather, they experience numerous cognitive, intellectual and
social advantages as they become bilingual. The question that prompted us was:
Can researchers successfully effect a change at the school/community level that
will transform a subtractive bilingual experience into an additive one?
Working as research collaborators of school administrators in Madawaska,
Frenchville, and Fort Kent, small communities in northern Maine, we selected
at random a subset of elementary school classrooms, and assigned available
bilingual teachers to teach half the day in French and half in English. They were to
follow the conventional academic curriculum, supplementing English textbooks
with French ones from France or Quebec or with French mimeographed
materials covering curriculum content. The families were mainly of working- or
lower-middle-class socioeconomic standing, and nearly 90% of the children had some
audio-lingual skill with French at home and in social settings. The random
selection of classrooms and pupils provided us with two essentially similar groups of
Franco-American youngsters: the treatment group received a four-to-five year
experience with 50% of their instruction in French (which meant that they had
only 50% of school time spent in English instruction) while the control group
followed a conventional all-English program. Both groups had teachers who
were from the region and all of these were also Franco-American.
The results revealed: 1) progressive improvement in French skills (writing
and reading as well as audio-lingual) for the treatment group, as expected; 2) the
same, and in several respects better, scores on standardized measures of English
skills for the immersion group over the controls; and 3) the same or better
achievement than the controls in subject matters taught through French (like
math and social science), even when tested in English.
How can one explain these outcomes? The explanation I find most reasonable
is that the fate of language minority children in public schooling can be substantially
improved if they are given a chance to study and learn through their
heritage languages. Here we had apparently successfully transformed a subtractive
bilingual experience into an additive one, and our guess was that some
deeper underlying process was likely a key mediator of these favorable outcomes.
More specifically, we had hypothesized that a sense of pride in having a
French heritage and a sense of value attached to the French language were likely
involved (see Lambert 1984). This prompted us to administer pre-post tests,
for both the treatment and control children, of self views and of evaluations of
heritage culture and language. Statistical comparisons revealed a significant
difference, favoring the treatment group of pupils, who, in contrast to the controls,
were proud and happy to be both American and French and who were pleasantly
surprised and equally proud that French was as useful and precise a language for
school learning as was English, a set of ideas the control children had no way to
develop.
This example, I suggest, is both small and community based, and it is by
design as carefully a control-group, product-oriented study as we could make it, and
yet it was much more. It provided us with an opportunity to test out potentially
important underlying processes that help us understand the different meanings that
being bilingual/bicultural can have for both language minority and language
majority families in an American setting.
Considering all four illustrations, what are the essential features of this
"deeper type" of process research or this "Vygotsky style" search for underlying
processes? I see two important features: (1) all such examples are applications
of a hypothetical-deductive research model (cf. Underwood 1957; or Hull 1952)
that makes active use of "hypothetical constructs" or "intervening variables"
(see MacCorquodale and Meehl 1948). These hypothetical constructs are often
simply sophisticated guesses on the part of the researcher. Their importance lies
in the fact that they can be linked, through experiments, with particular input
variables (also known as "independent variables") that are systematically related
to one or several output variables ("dependent variables"). (2) The basic model
also implies multiple hypothetical deductions and testings of the central
construct, and thus there is an implied requirement that the researcher-theoretician
strive for "construct validity" so as to enhance the believability of the basic
construct (see Cronbach and Meehl 1955; Underwood 1957: 117ff). This old,
dependable model gets new names and new twists from time to time, but never any
substantive changes. And as is apparent in the examples, the constructs or basic
processes can be psychological in nature, group or community oriented, and
even culture oriented. This suggested model does imply, however, that valuable
research on foreign or second language learning requires much more than
linguistic or pedagogical training and interest; it requires as well some extensive
experience and interest in one or more of the behavioral sciences, either on the
part of the researcher or on the part of research collaborators. The important
message, however, is that progress in FL or SL research calls for prime attention
to underlying hypothetical constructs or, more simply, to the educated guesses that
experienced teachers are so competent at generating. Progress also calls for
careful and systematic testing using product-type, quantitative research
approaches which incorporate as much process data as is economically feasible.
The smaller the scale of the design and the more local its scope, the greater the
progress is likely to be.
Acknowledgement
This paper was also presented at the Conference on Foreign Language
Acquisition Research and the Classroom, University of Pennsylvania,
October 12-15, 1989.
Notes
1. This point is important and worth documenting. Recently, Don Taylor and I conducted a
community study in Detroit, Michigan (see Lambert and Taylor 1988) wherein we worked
with two large school districts over a 3 year period. The superintendent of one district
became our good friend. He was a Polish-American and gave our project his personal backing.
He was happy that he had some 15 teachers hired to teach "bilingually" in such languages as
Arabic, Polish, Albanian, Greek, and Vietnamese. Watching these teachers in action, Taylor
and I noted that none of them used any other language than English except for rare special
moments when the other language was used with a particular child, and in a soft voice. On
hearing about this the superintendent called all the teachers together to confirm the fact and
to hear the reasons why: e.g. someone at the Office of Education for the State of Michigan
had sent a directive that this was the way to do bilingual teaching. The directive clearly
bothered the Arab and Greek teachers but seemed normal and sensible to most of the Polish
teachers. Another more recent example: Taylor and I, continuing the same project in the
Dade County (Florida) Public School System, visited a 1 hour "bilingual" class in science for
high school pupils. All 35 students were Spanish speaking with varying degrees of skill in
English. The point is that not one word of Spanish was used by the teacher! Someone above
had told her that was what she was to do and she was comfortable with the schema, arguing
that "Since I'm obviously Hispanic myself I know how to get these Hispanic youngsters
interested". She was an excellent teacher, but in no way was she teaching bilingually. God only
knows what the limited-English minority child was getting from that class, and valuable
opportunities were lost for the fully bilingual children in that class to realize that the same
teacher could have made science both exciting and Spanish at the same time.
2. Incidentally, Swain's finding makes one wonder about "sheltering" the language of
instruction for minority language students, i.e. being too concerned that the inputs are simple
and "comprehensible". Although this view is presently in vogue, I'm more inclined to the
pedagogical views of Sir Walter Scott who, in his 1831 book dedicated to his 5 year old
grandson, wrote in the preface:
"These tales were written... for the use of the young relative to whom they are
inscribed... The compiler... after commencing his task in a manner obvious to the
most limited capacity... was led to take a different view of the subject, by finding
that a style considerably more elevated was more interesting to his juvenile
reader. There is no harm, but on the contrary there is benefit, in presenting a
child with ideas somewhat beyond his easy and immediate comprehension. The
difficulties thus offered, if not too great or too frequent, stimulate curiosity, and
encourage exertion" (Scott 1831: iii-iv).
Placing input clearly within versus somewhat beyond the realm of comprehensibility is a
minor point, but one that deserves a series of careful experiments. And we had better not
think too much about the idea that these wonderful stories were enthusiastically read by
five-year-olds in 1831!
References
Baker, K. and A.A. De Kanter. 1981. Effectiveness of bilingual education: A review of the literature.
Washington, DC: Office of Planning, Budget and Evaluation, U.S. Department of Education.
Bryk, A.S. and S.W. Raudenbush. 1987. "Application of hierarchical linear models to assessing
change." Psychological Bulletin 101.147-158.
Campbell, D.T. and R.F. Boruch. 1975. "Making the case for randomized assignment to
treatments by considering alternatives." Evaluation and experiment ed. by C.A. Bennett and A.A.
Lumsdaine, 195-296. New York: Academic Press.
Chaudron, C. 1988. Second language classrooms: Research on teaching and learning. New York:
Cleghorn, A. and F. Genesee. 1984. "Languages in contact: An ethnographic study of interaction
in an immersion school." TESOL Quarterly 18.595-625.
Cole, M. 1987. Quoted in L.S. Hearnshaw, The shaping of modern psychology, 177. London:
Routledge and Kegan Paul.
Cronbach, L. and P.E. Meehl. 1955. "Construct validity in psychological tests." Psychological
Bulletin 52.281-302.
Dubé, N.C. and G. Herbert. 1975. The St. John Valley bilingual education project. Washington,
DC: U.S. Department of Health, Education and Welfare.
Fisher, C.W. and L.F. Guthrie. 1981. Executive summary: The significant bilingual instructional
features study. Document SBIF-83-R.14.
Gardner, R.C. 1981. "Second language learning." A Canadian social psychology of ethnic relations
ed. by R.C. Gardner and R. Kalin. Toronto: Methuen.
Gardner, R.C. and W.E. Lambert. 1972. Attitudes and motivation in second language learning.
Rowley, MA: Newbury House.
Gardner, R.C. and P.C. Smythe. 1975. Second language acquisition: A social psychological
approach (= Research Bulletin, 332.) London/Ontario: University of Western Ontario,
Department of Psychology.
Getzels, J.W. and P.W. Jackson. 1962. Creativity and intelligence. New York: Wiley and Sons.
Glicksman, L. 1981. Improving the prediction of behaviors associated with second language
acquisition. Unpublished doctoral dissertation. London/Ontario: University of Western Ontario.
Hull, C.L. 1952. A behavior system. New Haven: Yale University Press.
Lambert, W.E. 1984. "An overview of issues in immersion education." Studies on immersion
education ed. by Office of Bilingual Bicultural Education. Sacramento: California State
Department of Education.
Lambert, W.E., H. Giles, and A. Albert. 1976. Language attitudes in a rural community in northern
Maine. Unpublished manuscript. Montreal: Psychology Department, McGill University.
Lambert, W.E., H. Giles, and O. Picard. 1975. "Language attitudes in a French-American
community." International Journal of the Sociology of Language 4.127-152.
Long, M. 1984. "Process and product in ESL program evaluation." TESOL Quarterly 18/3.409-425.
McClennan, H. 1945. Two solitudes. New York: Duell, Sloan and Pearce.
MacCorquodale, K. and P.E. Meehl. 1948. "On a distinction between hypothetical constructs and
intervening variables." Psychological Review 55.95-107.
Peal, E. and W.E. Lambert. 1962. "The relation of bilingualism to intelligence." Psychological
Monographs 76.1-23.
Ramirez, D., S.D. Yuen and D.S. Ramey. 1988. Longitudinal study of immersion, early-exit and
late-exit transitional bilingual education programs for language minority children: Study design
overview. San Mateo, CA: Aguirre International.
Rogosa, D., D. Brandt and M. Zimowski. 1982. "A growth curve approach to the measurement of
change." Psychological Bulletin 92.726-748.
Scott, S. 1973. The relation of divergent thinking to bilingualism: Cause or effect? Unpublished
manuscript. McGill University, Psychology Department.
Scriven, M. 1967. "The methodology of evaluation." Perspectives on curriculum evaluation ( =
American Educational Research Association: Monograph Series on Curriculum Evaluation, 1)
ed. by R.W. Tyler, R.M. Gagné and M. Scriven, 39-83. Chicago: Rand McNally.
Sowell, T. 1983. The economics and politics of race. New York: Morrow and Co.
Swain, M. 1987. Personal communication. See also Harley, B., et al. 1987. The development of
bilingual proficiency: Final report. Toronto: Modern Language Center, OISE.
Tikunoff, W.J. 1980. Overview of the significant bilingual instructional features study. San Francisco:
Far West Laboratory, Document SBIF-80-D.1.1.
Tikunoff, W.J. 1981. Significant bilingual instructional features study: A report of the
state-of-the-study. San Francisco: Far West Laboratory, Document SBIF-81-R.8.
Underwood, B.J. 1957. Psychological research. New York: Appleton-Century-Crofts, Inc.
Vygotsky, L. 1962. Thought and language. Cambridge, MA: MIT Press.
Willett, J.B. 1988. Questions and answers in the measurement of change (= Review of Research in
Education, 15.) In press.
Willig, A.C. 1985. "A meta-analysis of selected studies on the effectiveness of bilingual
education." Review of Educational Research 55.269-317.
Ask a Stupid Question...:
Testing Language Proficiency in the Context of
Research Studies
Christine Klein-Braley
The context of discussion here is that of research into foreign language
teaching, specifically into the efficiency of language learning in instructional
settings. There are a large number of variables and factors which are probably
important in this context; indeed the main problem in research in this area seems
to be the complexity of the whole process which makes it extremely difficult to
set up research which will enable the effect of any one variable to be pinned
down. Nevertheless one basic question is likely to be asked in any study: how
well has the language in question been learned and how effectively can it be
used? Whether the planned investigation involves comparisons of teaching
methods, or achievement levels, or differential motivation in the learner, or the
effect of cultural background or the native language on the learning process, in
all cases some way of measuring general language proficiency will be required.
There are a number of ways available for obtaining information in the context
of a research study. Some are impressionistic, some informal, some semi-objective,
some objective. Wherever possible in a project, however, objective
procedures should be used rather than other techniques and this generally
means tests. Tests are "systematic procedure[s] for observing behavior and describing
it with the aid of numerical scales or fixed categories" (Cronbach 1984:
26), and they are based on a well-developed theory of measurement, which
means that their qualities as measuring instruments can be easily evaluated.
Tests in educational settings have numerous purposes. For instance they can
be used to inform learners of their standing with regard to other learners or to
the teaching goal. They can provide teachers with feedback about their own
performance or that of the learners. Outside the classroom and divorced from any
course of study, tests can be used as qualifying procedures, for instance by
universities to ensure that foreign students know enough of the language to
enable them to follow lectures in their chosen area, or by aviation authorities to
check whether pilots and air traffic controllers can use English to communicate
with each other successfully.
Ideally it would be possible to construct a test which would, simultaneously,
answer all these questions, and any others we might happen to have. Unfortunately
there is no such animal as the diagnostic achievement aptitude test of
general language proficiency. Tests must be carefully designed to answer specific
questions, and often the more suitable a test is for one purpose the less suitable
it is for another. For instance, a test of achievement (i.e. a test designed to provide
feedback on learning progress for teacher and pupil) would begin by surveying
the units to be taught in a given period of time and the test would consist
of a representative sample of these units weighted according to their relative
importance. If both teaching and learning had been maximally effective all pupils
would score 100%. It is obvious that a test of this kind is not suitable as a qualification
procedure since its aim is extremely narrow, and there is no way in which
the test scores can be related to a concept of general language proficiency.
Tests for use in research studies are no exception to the general rule that
tests can only provide answers to the questions they are constructed to investigate.
They need to have special qualities, and this paper will be devoted to summarizing
their most important and desirable characteristics.
1 Preliminary considerations
1.1 Test development and piloting
Serendipity is a rare phenomenon and not one which can be deliberately
planned into an experiment. Therefore questions of testing must be considered
right from the earliest planning stages since tests need to be properly and professionally
developed. This can mean several cycles of item development, pretesting,
test administration to a sample group, item analysis and further item
development. The final test forms need to be piloted "in the field" so as to ensure
that the instructions, timing, manual and materials are all functioning as they
should and are foolproof. Procedures for test scoring and data collection need to be
worked out well in advance, and, particularly if they are to involve the help of
teachers, need to be investigated as to their practicability.
There are many experimental designs which can be used in research studies,
and naturally one will select a design appropriate for the given question. Often
such designs involve repeated testing of the same individuals to determine
whether changes (= learning!) have occurred in the interval. Sometimes it is
possible to use the same test twice, but it would be preferable to have alternative
versions of the test available for use as required. Statistically, parallel tests are
defined as having the same mean score and the same standard deviation. The
reason for this is obvious: only if the tests are equal in difficulty is it possible to
determine whether a change has taken place between the two test sessions.
Conceptually, it seems desirable that the tests should have the same types of items
and numbers of tasks to make them equivalent in language processing
techniques. Any test format to be used in a research study ought, therefore, to be
one which allows for the relatively effortless production of parallel versions,
which must then be equated by empirical testing.
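As a rough illustration of the statistical definition just given, the following Python sketch compares the means and standard deviations of two pilot forms. The function name, the tolerance values and the scores are hypothetical and chosen only for illustration; a real equating study would use larger samples and a formal equating procedure.

```python
from statistics import mean, stdev

def forms_look_parallel(scores_a, scores_b, tol_mean=1.0, tol_sd=1.0):
    """Return True if two forms have means and SDs within the given tolerances.

    The tolerances are illustrative assumptions, not established standards.
    """
    return (abs(mean(scores_a) - mean(scores_b)) <= tol_mean
            and abs(stdev(scores_a) - stdev(scores_b)) <= tol_sd)

# Hypothetical pilot scores for two candidate forms of the same test
form_a = [12, 15, 18, 20, 22, 25, 27, 30]
form_b = [13, 14, 19, 21, 21, 24, 28, 29]

print(mean(form_a), mean(form_b))
print(round(stdev(form_a), 2), round(stdev(form_b), 2))
print(forms_look_parallel(form_a, form_b))
```

A check of this kind only screens candidate forms; it does not replace the empirical equating mentioned above.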
Sometimes it is desirable to investigate the achievement levels of large
groups, and in such cases there is a special technique, matrix sampling (cf. e.g.
Cronbach 1984: 527), which enables a large pool of items to be split up into a
number of shorter tests so that no one individual needs to respond to all items.
This saves time without serious loss of information.
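A minimal sketch of the idea, assuming the simplest possible scheme: the item pool is shuffled and dealt into a fixed number of short forms, and each examinee is assigned one form. The function name and parameters are hypothetical, and practical matrix-sampling designs are usually more elaborate (e.g. balancing content across forms).

```python
import random

def matrix_sample(item_ids, n_forms, examinee_ids, seed=0):
    """Split an item pool into n_forms short forms and assign one form per examinee."""
    rng = random.Random(seed)
    items = list(item_ids)
    rng.shuffle(items)
    # Deal the shuffled items round-robin so that every item lands in exactly one form
    forms = [items[i::n_forms] for i in range(n_forms)]
    # Shuffle the roster, then cycle through the forms so that group sizes stay balanced
    roster = list(examinee_ids)
    rng.shuffle(roster)
    assignment = {person: k % n_forms for k, person in enumerate(roster)}
    return forms, assignment

# Hypothetical pool of 60 items, 4 short forms, 100 examinees
forms, assignment = matrix_sample(range(60), 4, range(100))
print([len(f) for f in forms])
```

Each examinee answers only a quarter of the pool here, while the group as a whole still covers every item.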
1.2 Test bias
It is essential for questions of test bias to be considered before the final
form of the test or test battery is decided. No subgroup of the total population
may be disadvantaged by the tests. Thus sex, or race, or social background
should not affect the test results in a systematic way. The test developers must
be aware of this problem and investigate it in the preliminary research.
1.3 Practical considerations
Researchers can choose either to have all tests administered by a small
group of trained personnel, thus keeping a tight control over all testing procedures
and enabling more complex test forms to be used, or they can delegate the
testing to the classroom teachers. If they do this, then the tests should be as
simple as possible to administer, and if the teachers are expected to do the scoring
(although this is undesirable for other reasons, see below, Section 2.1.1
Objectivity) then this should not be excessively complex or involve an inordinate
amount of work. It may not be possible or desirable to give the teachers full
information before the tests are administered; for instance if the tests are known
in advance the teachers may well teach for the test, since it is entirely reasonable
for teachers to wish their pupils (and thus indirectly they themselves) to do well.
However all those involved in the study should be offered and provided with full
feedback after the relevant parts of the study have been completed.
Test development should also take into consideration practical constraints
of the school or institutional setting. For instance all tests should be planned to
fit into convenient time slots (e.g. lessons). This might mean using two short
tests rather than one long one. The decision to use media, even such modest
media as cassette recorders or overhead projectors, is an open invitation to
Murphy's Law in rich countries, and may be impossible to implement in poor
countries. The facilities available for any measure which goes beyond traditional
paper and pencil tests (with the researchers providing paper, pencils and pencil
sharpeners!) need to be carefully investigated well in advance, before test
formats are developed which then have to be discarded as too ambitious after the
study has already begun.
For the pupils the testing should be as painless as is reconcilable with the
necessity of collecting the data required. If alternative procedures are possible,
then the one which is shortest, least complicated, most interesting and least likely
to provoke test anxiety should be selected. Sensitive personal data should only
be collected if this is really essential. So far as possible pupils too should be offered
feedback about the study in general and their own performance in particular.
The amount of time teachers or educational supervisors are prepared to
allot to the research testing may be much more restricted than the researchers
originally hoped. Other constraints may appear unexpectedly. It therefore makes
sense to consult those responsible inside the school(s) or institutional system(s)
involved at a very early stage of planning.
One reason why some researchers are reluctant to do all this is that the process
of explaining and defending both the study and the planned tests to teachers,
administrators and outside experts can be both lengthy and painful. Yet
this is an important stage in an investigation. On the one hand the researchers
are forced to formulate their own objectives and methods very clearly in order
to get them across. On the other hand, if they are successful, they will often find
not only that they have gained willing support, but also that the practitioners
suggest aspects for inclusion which might not otherwise have been taken into
consideration.
2 Desirable test characteristics
2.1 Psychometric considerations
Like all tests, tests to be used for research purposes need to conform to
basic test standards, that is they should be objective, reliable and valid. Despite
more than 25 years of professional language testing (cf. Carroll 1961; Lado
1961) there still seems to be quite considerable ignorance as to what this means,
even among alleged experts in the field of language testing. One is surprised, for
instance, to find Morrow (1979: 51) claiming "Reliability [in communicative
tests]... will be subordinate to face validity" or Underhill (1987: 105) writing
"Both reliability and validity are rather vague concepts which suffer from a lack
of clear definition about exactly what they are, let alone how they should be
assessed or calculated". In fact these concepts are quite clearly defined and are
used in exactly the same way as they are used in psychological testing (cf. e.g.
APA 1974), and the qualities to which they refer are intuitively and obviously
desirable since they contribute towards making the tests equitable and fair for
all examinees. Furthermore there is general consensus about assessing how far
tests meet the criteria.
2.1.1 Objectivity
A test is objective if the test score obtained by the examinee is not affected
in any way by the experimenter, proctor or scorer. Objectivity can be affected,
for instance, by some test forms being easier to read than others, by inadequate
or inaudible instructions from the test proctor, but perhaps most obviously and
importantly by scorer bias or subjectivity. A multiple-choice test is machine-
scorable and thus, in this sense, entirely objective; an essay or translation
marked by the pupil's own teacher is very likely to be contaminated by scorer
bias.
It is important to realise that this is not a criticism of the inability of some
(possibly incompetent?) language teachers to disregard personal prejudices in
order to assess their pupils objectively. In medical research into the effectiveness
of new drugs it has been found necessary to introduce the double or even
triple blind experiment in order to eliminate experimenter bias. Only if the
patient who takes the drug, the physician who administers it, and the laboratory
staff who perform the blood counts (or whatever) are all kept ignorant of
whether the patient is receiving the new drug or the placebo is it possible to
evaluate how effective the new medication is. Subjectivity is not a matter of
willpower. In terms of language testing this means that if we decide to use item
types which involve subjective scoring we must be aware that it is essential to
take particular measures to ensure a maximum of objectivity: scorer training to
ensure that scorers confronted with similar scripts will make similar judgments,
followed by random assignment of scorers and test scripts over pre- and post-test
sessions (cf. Campbell and Stanley 1963: 184).
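The random assignment step can be sketched as follows. This is an illustrative Python outline under simple assumptions: scripts from both sessions are pooled and shuffled, then dealt to scorers in turn. The function name and the script labels are hypothetical; in practice the scripts would carry anonymous codes so that scorers cannot tell which session a script comes from.

```python
import random

def assign_scripts(script_ids, scorers, seed=0):
    """Randomly distribute pooled pre- and post-test scripts across scorers."""
    rng = random.Random(seed)
    scripts = list(script_ids)
    rng.shuffle(scripts)  # mixes the two sessions so no scorer sees them in order
    # Deal the shuffled scripts round-robin to keep scorer workloads equal
    return {script: scorers[i % len(scorers)] for i, script in enumerate(scripts)}

# Hypothetical example: 6 pre-test and 6 post-test scripts, 3 trained scorers
scripts = [("pre", i) for i in range(6)] + [("post", i) for i in range(6)]
assignment = assign_scripts(scripts, ["A", "B", "C"])
for script, scorer in sorted(assignment.items()):
    print(script, scorer)
```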
2.1.2 Reliability
A test is reliable if it measures accurately and exactly. This is investigated
heuristically by calculating whether conditions which should be met if the tests
are measuring properly are, in fact, met. Reliability is thus determined by finding
out (a) whether all the test items correlate with each other (internal consistency);
(b) whether the two halves of the test correlate with each other (split-half
reliability); (c) whether the same test administered twice produces scores which
correlate highly with each other (test-retest reliability) and (d) whether two tests
designed to be parallel to each other have high intercorrelations (parallel
reliability). In fact it is rare for all four types of reliability to be calculated for any one
test although each of them involves a slightly different concept of measurement
stability.
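Checks (a) and (b) can be sketched in a few lines of Python. The statistics used here, Cronbach's alpha for internal consistency and an odd-even split-half correlation with the Spearman-Brown correction, are common choices but are my assumption; the text does not name particular coefficients. The response matrix is invented for illustration.

```python
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cronbach_alpha(rows):
    """Internal consistency; rows holds one list of item scores per examinee."""
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def split_half(rows):
    """Odd-even split-half reliability, stepped up with the Spearman-Brown formula."""
    odd = [sum(r[0::2]) for r in rows]
    even = [sum(r[1::2]) for r in rows]
    r = pearson(odd, even)
    return 2 * r / (1 + r)

# Hypothetical right/wrong (1/0) responses of six examinees to four items
rows = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(rows), 3), round(split_half(rows), 3))
```

Test-retest and parallel-forms reliability, checks (c) and (d), would use the same `pearson` function on the total scores from the two administrations.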
The result of the investigation is a figure between 0 and 1, known as a reliability
coefficient and technically representing the squared correlation between the
postulated "true score" and the observed test score, i.e. the proportion of
observed-score variance that is not due to measurement error. If the true score and
the test score were identical this figure would be 1.00, but since all
tests are affected by measurement error the difference between 1 and the actual
reliability coefficient is an indicator of the overall accuracy of the test. Obviously
the nearer the reliability coefficient approaches 1, the better the test. If judgments
of or decisions about individuals are to be made, then testers usually demand
a reliability coefficient of at least .9; where groups rather than individuals
are being tested the required level is .7.
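One way to see why a lower coefficient can be tolerated for groups is the standard error of measurement from classical test theory, SD * sqrt(1 - reliability). This formula is not given in the text and is introduced here as an illustrative assumption, as are the score SD of 10 and the group size of 100.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1.0 - reliability)

def error_in_group_mean(sd, reliability, n):
    """Measurement error left in the mean of n examinees, assuming independent errors."""
    return standard_error_of_measurement(sd, reliability) / sqrt(n)

# Hypothetical test with a score SD of 10 points, group of 100 examinees
for r in (0.9, 0.7):
    print(r,
          round(standard_error_of_measurement(10, r), 2),
          round(error_in_group_mean(10, r, 100), 2))
```

Under these assumptions an individual score carries an error of roughly 3 points at .9 and roughly 5.5 points at .7, whereas the mean of a group of 100 is affected by only about half a point even at .7.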
Tests intended for research purposes are normally tests which aim at
gathering information about groups: the sort of effects we are likely to be looking
for will probably only reveal themselves in group terms. This means that the
lower reliability level of .7 will usually be acceptable in a research context. It is,
however, important for test reliability to be determined in the preliminary stage
because it is a tragic waste of time, money and resources to embark on large-
scale testing using unreliable tests.
2.1.3 Validity
A test is valid if it measures the thing it is intended to measure. There is no
such thing as an inherently valid test since if a direct indicator of the trait or
quality we wish to measure is available we would use this rather than the test.
Technically it would be possible, for instance, to develop a test which would help
us to decide whether someone is a man or a woman, but nobody has so far taken
the trouble since there are easier ways available of doing this. All tests, therefore,
are indirect ways of trying to get at information which is unavailable directly
or immediately; but then, so is a thermometer or a measuring tape. No test is
obviously and of itself valid; validity must always be demonstrated by the test
constructor.
Psychometrics usually defines three types of validity: criterion-related,
construct, and content validity.
The easiest type of validity to determine is criterion-related, correlative
or empirical validity. The test is valid if the test scores correlate with some other
available measure of the attribute or trait. Often a newer test is developed as a
short cut to acquiring the same information as another, often more complicated,
sometimes merely older, measure. Tests can also have the purpose of predicting
information which will only become available at some time in the future, and
they do this by assuming that a relationship which has been shown to hold between
test and criterion for one set of data will continue to operate for further
data sets. Validity in this approach is therefore demonstrated by agreement
(correlation) between criterion and test, and very often the criterion is itself a
test. Agreement between the old, tried and tested method and the new measure
is viewed as confirmation that both test the same thing, i.e. both are valid. This
approach is obviously problematical because it stands and falls with the validation
of the criterion or the previous measure. Nevertheless language testers have
made frequent use of this type of validity.
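The correlational check described above amounts to computing a validity coefficient between the new test and the criterion. A minimal Python sketch follows; the Pearson coefficient is the usual choice but is my assumption here, and the scores are invented.

```python
from statistics import mean

def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between a new test and an established criterion measure."""
    mx, my = mean(test_scores), mean(criterion_scores)
    sxy = sum((a - mx) * (b - my) for a, b in zip(test_scores, criterion_scores))
    sxx = sum((a - mx) ** 2 for a in test_scores)
    syy = sum((b - my) ** 2 for b in criterion_scores)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical scores of five examinees on a short new test and on a longer criterion test
new_test = [10, 12, 15, 18, 20]
criterion = [40, 45, 55, 60, 70]
print(round(validity_coefficient(new_test, criterion), 2))
```

A high coefficient shows only that the two measures agree; as noted above, it says nothing about whether the criterion itself is valid.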
A more complex approach to validation is involved in the concept of con-
struct validation. Here the relationships between a test and the underlying the
ories are examined. The theory predicts lawful relationships and if the test can
show that these relationships do in fact exist, then the test is viewed as valid in
terms of the theory. Sometimes test construction proceeds from the theoretical
assumptions (e.g. the Noise Test or the C-Test) and is tested against them (cf.
e.g. Gaies 1988; Klein-Braley 1985). More often in language testing, construct
validation has begun ex post facto when a new type of test — the most obvious
example is cloze tests — turns out to have interesting and unexpected properties
(cf. Oller 1973 but also Alderson 1979 and Klein-Braley 1981), or because a specific
test procedure has become very popular as a criterion measure (cf. Bachman
and Palmer 1981 on the FSI Interview).
The third type of validity is content validity. Here the universe of interest is
defined and then sampled. This sample is converted in some way into test items
and administered to the examinees. From the performance on the sample conclusions
are drawn about examinee performance in the whole area. For instance
Engels (1982), wishing to sample student knowledge of the 2,000 most frequent
words in English, took random samples of successive 500 word groups (50 items
in each). This test is content valid. Similarly, tests which sample from a defined
curriculum to decide whether students have learned what is being taught use the
concept of content validation. Expert judges are brought in to determine how
well the proposed items represent the universe concerned.
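A sampling plan of the kind attributed to Engels above might be sketched in
Python as follows. The word list is a placeholder (word0001 to word2000 standing
in for a real frequency-ranked list), and the function name, parameters and
fixed seed are assumptions made for the example.

```python
import random


def sample_by_frequency_band(ranked_words, band_size=500, per_band=50, seed=0):
    """Draw a random sample of per_band words from each successive frequency band."""
    rng = random.Random(seed)  # fixed seed so that the draw can be repeated
    bands = []
    for start in range(0, len(ranked_words), band_size):
        band = ranked_words[start:start + band_size]
        bands.append(rng.sample(band, min(per_band, len(band))))
    return bands


# Placeholder list: 2,000 entries ordered from most to least frequent.
ranked_words = [f"word{rank:04d}" for rank in range(1, 2001)]
bands = sample_by_frequency_band(ranked_words)
```

Sampling within each band, rather than from the whole list at once, ensures that
every frequency range is represented by the same number of items.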
Often, however, the universe is difficult to define, or is infinite, or both.
Language is actually a good example of this. What is language, and what, as
Spolsky (1973) asked, does it mean to know a language? At this point the conceptual
difference between content and construct validation seems to vanish because
the production of any content-valid language test must begin by
constructing a theory of language which can then be used as a basis for sampling.
I shall come back to this point later.
In addition to genuine validity, which must be investigated empirically,
there also exists the concept of face validity. This term is testers' shorthand for
the way a test looks to the naive (i.e. non-expert) user, to the examinee, to the
examinee's friends, parents, relations, even to the teacher who uses tests without
investigation of the assumptions they are based on. Face validity is desirable in
the sense that examinees (and other users) should feel that a test is relevant, appropriate
and fair. But face validity is in no way sufficient unless the test has
been shown to be valid according to one of the psychometric approaches. And if
test validity has been demonstrated psychometrically then face validity can to a
large extent be ignored. If a lawful and regular statistical relationship could be
shown to hold between students' abilities to lob stones and their subsequent performance
as simultaneous interpreters, it would be entirely legitimate to take
cohorts of applicants for the United Nations Translation Training Department
to the nearest sports field for a stone-throwing contest, even though the face validity
of the selection procedure would be zero.
2.1.4 Test scores and scales

The results of testing can produce nominal-, ordinal- or interval-level data.
Eye colour is a nominal-level variable. People's eyes are either blue or brown or
green. Ranks in the army are ordinally scaled: a corporal is higher than a private,
and a sergeant is higher still. But the units of measurement are of unknown size:
is the difference between private and corporal the same as that between corporal
and sergeant? A thermometer, on the other hand, is intervally scaled because
the differences between the units are of equal size. This is why it makes sense to
talk about average temperatures, but makes no sense to talk about average eye
colour or average rank in the army. Complex statistical analysis can only be performed
on interval-level data, and therefore test scores and other data should, if
possible, produce numerical results (rather than ratings or categorisations) since
interval-level data contains more information.
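The three levels can be illustrated with the summary statistic that each one
supports. The Python sketch below uses invented data; the variable names and the
three-step rank order are assumptions made for the example.

```python
from statistics import mean, median, mode

# Nominal: categories without order, so only the most frequent value is meaningful.
eye_colour = ["blue", "brown", "brown", "green", "brown"]
most_common_colour = mode(eye_colour)

# Ordinal: ordered categories of unknown spacing, so the median is meaningful
# but the mean is not.
RANK_ORDER = ["private", "corporal", "sergeant"]
army_rank = ["private", "sergeant", "corporal", "private", "corporal"]
rank_positions = [RANK_ORDER.index(rank) for rank in army_rank]
median_rank = RANK_ORDER[int(median(rank_positions))]

# Interval: equal units, so an average can be computed and interpreted.
temperature_c = [18.5, 21.0, 19.5, 22.0, 20.0]
mean_temperature = mean(temperature_c)
```

Each level admits the statistics of the levels below it, which is one way of
seeing why interval-level data carries the most information.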
If a test is to be used with the aim of revealing differences, whether between
individuals or groups, it should ideally produce a wide spread of scores with a
small measurement error, i.e. it should be highly reliable and have a large standard
deviation. If only three, five, or seven categories are available (as on some
rating scales, for instance) then fine distinctions are impossible. For the same
reason the tests should overall have a medium level of difficulty. If the tests are
too easy ceiling effects can come into operation: there is not enough room on
the scale for the (existing) differences between the good, the very good and the
superb to emerge. Similarly, a test which is too difficult can suffer from "floor"
effects.
One can, of course, also be confronted with exactly the opposite problem: if
there is a very wide range of differences between the students to be tested it is
quite possible that one test will not be able to cover the full range of achievement.
Here, one would aim at producing a range of tests with overlapping elements
so that they could all be fitted onto one scale. The most obvious way of
doing this is to make use of latent trait approaches (cf. e.g. Henning et al. 1985),
but this again demands interval-level data.
Each test in itself should be unidimensional, i.e. all the items on the test
should measure the same trait or attribute. This can be examined statistically. If
more than one dimension is to be measured then tests can be combined into a
battery.
2.1.5 Other aspects

Other aspects which would play a more important role in tests designed for
other purposes are less important in the context of a research study. Face validity,
for instance, is only a minor consideration because the research context itself
confers face validity on the procedure provided the teachers and pupils
involved are promised — and given — full information and feedback after the
study has been completed. In any case "placebo" tests are possible: if it is felt
that participants expect a particular type of test which is not, in fact, envisaged as
part of the study, there is no harm in including a test of this type in the test package
but not scoring it. The problem of "teaching for the test" is also one which
seems negligible here: it is unlikely that research measures will become standard
testing procedures, and so the backwash effects of tests on the teaching process
can probably be ignored.
2.2 Linguistic considerations
Any language test is necessarily based on a theory of language, either implicitly
or explicitly, directly or indirectly. Furthermore any test is ultimately based
on the concept of representative sampling of the language as defined by this theoretical
concept. Since no language test circumscribes the entire universe of
what we want to test, our aim is always to generalise from test performance to
real-life language use.
In what Spolsky (1981) has called the prescientific or traditional trend of
language testing the theory was implicit and performance-oriented and the tests
used a job-sampling approach. The examinees translated texts into the foreign
language, or wrote essays in it, or answered questions on reading texts, or produced
précis, or interpreted Shakespearean dramas; in short, they performed
whatever task the teacher found appropriate. All tasks were obviously viewed as
inherently equivalent, and the examinee who made the fewest mistakes had by
definition made the most progress in learning the language. Since teachers were
the best judges of their students they set, marked and interpreted the tests. No
attention was paid to psychometric criteria, and there were no queries either
about the representativeness of the task (i.e. the sampling) or about possible
item bias. Questions of measuring error did not arise — the teacher's judgment
was law. These tests, although they are still predominant in many educational
systems, obviously have serious deficiencies as measuring instruments.
The language tests produced in Spolsky's second trend — the psychometric-
structuralist or modern tests — were based on an atomistic theory of language
(primarily American structuralism) and they followed Lado's dictum that "the
elements of language... can be profitably studied and described — and tested — as
separate universes" (Lado 1961: 25). In accordance with psychometric theory
the tests consisted of a large number of independent items (multiple measurement
reduces test error and increases reliability) and these were often derived
from a test construction matrix of skills (listening, speaking, reading, writing) by
components (grammar, syntax, vocabulary, phonology/orthography etc.) such as
that proposed by Harris (1969). Careful test development using item analysis
techniques ensured high reliabilities, and, moreover, the results of the tests
seemed to agree quite well with teacher judgments, meaning that the tests could
be viewed as correlationally valid.
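The claim that multiple measurement increases reliability can be made concrete
with the Spearman-Brown formula, a standard psychometric result that is not
itself quoted in this chapter. The Python sketch below assumes that the added
items are parallel to the existing ones; the function name and the starting
reliability of 0.50 are invented for the example.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by length_factor.

    Assumes the added items are parallel to those already in the test.
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical short test with reliability 0.50, lengthened 2, 4 and 8 times.
projected = {k: round(spearman_brown(0.50, k), 3) for k in (1, 2, 4, 8)}
```

Under these assumptions reliability rises steadily with test length, though
with diminishing returns.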
Psychometrically these tests were fine. But linguistically they seemed less
than satisfactory. On the one hand linguistic theory changed with the Chomskian
revolution, moving from an attitude to language that implied that knowledge of
all the units was the same thing as a knowledge of language, towards a view that
language use was a creative rule-governed process. Grave doubts arose as to the
weighting of the individual components and to the adequacy of the sampling,
since it was not possible to show that any one item or set of items was essential
to language proficiency. One of the most important controversies in the language
testing of the seventies and early eighties has been the question of
whether language competence is divisible or unitary. This arose initially from
the extraordinarily high correlations regularly found between subtests in language
test batteries which were intended to tap conceptually independent
dimensions such as grammar, vocabulary, and listening comprehension, for
example. Furthermore it had always been obvious that examinees who could
score high marks on multiple-choice items could not necessarily use their language
knowledge to function adequately in a given situation, and this seemed to
be attributable on the one hand to the testing factor, and on the other to the
presence or absence of real-time constraints.
The tests developed in Spolsky's third trend (the psycholinguistic-sociolinguistic
or post-modern) try to retain the positive features of the first two trends
while discarding their negative aspects. The demands of psychometric theory
concerning adequate measurement are generally accepted: tests must be objective,
tests must be reliable. The onus of demonstrating test validity is firmly
placed in the hands of the test constructor: anyone propagating a new test or reviving
an old one is expected to lay all the cards on the table, including the linguistic
theory or theories underlying test construction and full details of the
studies performed to investigate the psychometric qualities of the test(s). So,
while there is a wide variety of different approaches to be found in the field at
the present moment, there is also a general consensus among language testers
about what can be considered good professional practice. There is also fundamental
agreement that the tests can only be as valid as the theories of language
they are based on.
3 Tests
It seems to me that if we follow current "post-modern" testing practice the
types of tests available can be classified into three broad categories:
- tests of linguistic knowledge;
- tests of linguistic performance;
- tests of communicative performance.
Spolsky (1985) uses the same basic classification, but with a different
nomenclature: structural tests; general language proficiency tests; functional
tests.
3.1 Tests of linguistic knowledge
Tests of linguistic knowledge tend to be discrete-point item tests, and despite
the (at times rather violent) controversies of the seventies as to the linguistic
validity of such items (cf. e.g. Oller 1979), language testers nowadays see no
reason to discard them when they are the best technique for the purpose given.
Particularly in the teaching context it is often desirable to check on learner progress
in specific areas, and discrete-point items can be the most efficient way of
doing this. Thus such tests perform a useful service in the areas of diagnostic and
achievement testing. Discrete-point items and multiple-choice formats can even
have functional dimensions, for instance when the examinee is given four alternatives,
all of which are linguistically correct but only one of which is situationally
appropriate.
Construction of tests of this type is not difficult provided the sampling matrix
(i.e. the theory of language) is available. This can be given, for instance, in
the form of a curriculum or text book or list of items. There are many different
objective item types available, by no means all of them multiple-choice items.
While teachers can gain much useful information from relatively ad hoc
measures, in the context of a research study it would, of course, be necessary to
allow sufficient time for test development, which is laborious rather than theoretically
problematical provided, as I said, that the preliminary spadework of defining
the universe of elements making up the test domain has already been
done.
If, therefore, all the learners involved in the projected research had been
exposed to the same curriculum, a test of linguistic knowledge could well be a
suitable measuring procedure. Such tests can be entirely objective, they need not
be in multiple-choice format, they are highly reliable as a result of the test development
procedures, and content valid. They are, of course, only valid linguistically
to the extent that the curriculum or course of studies itself is linguistically
valid.
Testers are in general less happy about using tests made up of discrete-point
items where there is no sampling matrix available for test construction, because
in this case the test itself — now a proficiency, not an achievement measure — is
the theory of language. Any sample of test items is vulnerable to criticism on linguistic
grounds ("Why did you choose to test this item of vocabulary or grammar
rather than that one?"). In large-scale testing, on measures such as the TOEFL,
compromises have to be made between the desirable and the feasible. Tests of
this kind must be machine-scorable if they are to be administered to hundreds of
thousands of students every year. But a research project would presumably not
be confronted with enormous numbers of students to be tested. At any rate, because
of the problems involved in defining the areas to be sampled, selecting
and writing items, weighting subtests, etc., the theoretical problems in constructing
a discrete-point proficiency test are immense, and since this task is presumably
not the aim of the study, this type of test is probably not suitable in this
context.
3.2 Tests of linguistic performance
Tests composed of discrete-point items are based on the assumption that
the testers (or, passing the buck, the textbook authors or curriculum developers)
know what knowing a language means. But many testers are far from
certain that they, or indeed the theoretical linguists, do know this. Moreover it
is notorious that knowing a linguistic element does not necessarily involve being
able to use it correctly when attention is not focussed on it as the subject of a
language test item. In addition, what is taught is not necessarily what is learned.
Hence the performance-oriented test. Here the examinees are asked to submit a
sample of their language production or language processing to the tester. In
scoring this, the same very simple assumption is made that we find in traditional
testing: the examinee who produces the "better" sample has learned the language
more effectively.
The theoretical assumptions underlying the (generally holistic and text-based)
tests are now however more explicit. Based on the concept of an internalised
grammar, language learning is felt to involve the development of a rule
system in the learner's head. The current state of the rule system is manifested
in the sample obtained by the test. There is no need for the tester to specify
which rules are involved at any point provided that the sample of examinee performance
obtained is representative and therefore can be generalised. Thus, although
these tests have a theoretical basis and must accordingly demonstrate
construct validity, the notion of sampling is crucial.
Validation of the tests involves linking up test performance and the underlying
theory. One obvious investigation in this context is whether learners at different
stages of language learning actually do perform differentially in the tests
(cf. e.g. Gaies 1988; Klein-Braley 1985). An essential proof would consist in
demonstrating similar performance by examinees on more than one test sample.
(I have criticised cloze tests and translation "tests", for instance, because the
question of intercorrelations between two tests from the same "family" administered
to the same subjects has been virtually ignored: Klein-Braley 1983, 1987.)
Relationships can be specified which ought to hold between these tests and
other tests, both of language and other traits, and this can be empirically investigated
(cf. e.g. Raatz 1985).
What do tests of linguistic performance look like? There are, in fact, two
different groups of tests, subjective and objective.
3.2.1 Subjective tests of linguistic performance

The subjective tests appear on the surface to be identical to the "traditional"
language tests: essays, interviews, translations and so on. They are subjective
in two different senses. In the first place they are subjective because the
examinees select the samples of language they present for assessment. A student
asked to write an essay could probably produce a number of variations both in
content and in language on any one theme. And obviously any sensible student
in a testing situation will put his or her best linguistic foot forward.
This type of task is also subjective in the scoring phase. Research has
shown that such tasks are often neither reliable nor valid. French (1961) reported
an investigation where 300 essays were marked by 51 different assessors. No
fewer than 101 essays received all possible 9 grades, and no essay received fewer
than 5 grades. French attributes this to four different sources of error:
(1) Student error: a student can do well one day and poorly on another although
the task remains constant. This raises problems of reliability.
(2) Test error: a test calls for a sample of a student's behaviour. In a discrete-
point item test there are many items (since multiple measurement reduces
error), but an essay must be viewed as a one-item-test. This affects validity.
(3) Scale error: the marker can be easy or tough, and it is very difficult to get all
readers to grade papers on the same scale. This is a question of objectivity.
(4) Reader disagreement: even if the scorers use the same numbers of As, Bs
and Cs etc. they may not assign them in the same way. This lowers reliability and
validity.
The same sources of error are present in interview tests, in translation tests,
in précis or summary tests, etc.
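Disagreement of the kind French reports can be quantified quite simply. The
Python sketch below counts how many different grades each essay received and
how often two markers assign the identical grade; the three essays, five
markers and all the grades are invented for the example and are not French's
data.

```python
def distinct_grades(grades_by_essay):
    """Map each essay to the number of different grades it received."""
    return {essay: len(set(grades)) for essay, grades in grades_by_essay.items()}


def exact_agreement(rater_a, rater_b):
    """Proportion of essays to which two raters gave the identical grade."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of essays")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


# Hypothetical grades on a 9-point scale: one list of five markers per essay.
grades_by_essay = {
    "essay_1": [5, 6, 4, 7, 5],
    "essay_2": [2, 2, 3, 2, 4],
    "essay_3": [9, 8, 9, 7, 6],
}
spread = distinct_grades(grades_by_essay)
rater_1 = [grades[0] for grades in grades_by_essay.values()]
rater_2 = [grades[1] for grades in grades_by_essay.values()]
agreement = exact_agreement(rater_1, rater_2)
```

Exact agreement is a deliberately crude index; it is used here only because it
needs no assumptions about the scale level of the grades.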
Language testers are aware of these problems and know that the only way
out of the dilemma is control. Control of content, for instance. In a post-modern
essay test the students are likely to be asked to provide three, rather than one,
samples of their language, and are likely to be given very little choice about what
they are to write. So the task set would not be "Discuss alcoholism in around 300
words" but "Using the information given in Tables 1 to 3, discuss the causes and
effects of alcoholism in businessmen and the possibilities of treatment. Write
not more than 300 words". A scoring scheme, possibly with a model answer but
at the very least with definite criteria for assigning plus and minus points, will
have been set up in advance, and the essay will probably be marked by more
than one person. These scorers will have been trained to use the scoring scheme
reliably. A good example of the post-modern approach to traditional testing procedures
can be seen in the new TOEFL TWE test.
The use of such doubly subjective procedures in such a way that testers
would accept them as sufficiently objective to be interpretable involves an immense
amount of work and, because of the staff resources needed, is extremely
expensive. Nevertheless such procedures are essential in certain contexts: if the
teaching objective is writing skills then the test must obviously demand skilled
writing from the examinee — the backwash effect of testing on teaching should
never be underestimated. Whether subjective tests of linguistic performance are
needed in the context of research studies is another matter, since there are also
objective tests available.
3.2.2 Objective tests of linguistic performance

The objective tests of linguistic performance provide the examinee with a
sample of language to be processed in some specified way, so there is no opportunity
for examinee avoidance strategies. Furthermore there are correct solutions
to the individual items on the tests, or at least a limited number of
acceptable solutions. The most important new tests in this group are dictation
(an old test with a new justification!), the Noise Test, cloze tests and the cloze
variations (multiple-choice cloze; rational deletion cloze, e.g. Bachman 1981,
Olshtain and Feuerstein 1988; cloze elide procedures — Manning 1986a, 1986b),
and the C-Test (Klein-Braley and Raatz 1985). All these techniques have been
the subject of very thorough empirical investigation. But objective linguistic performance
tests also include objective listening and reading comprehension tests
(not necessarily in multiple-choice format) and a surprising number of objective
oral procedures (cf. e.g. Madsen and Jones 1981; Van Weeren 1981).
While the objective procedures eliminate errors due to scale error and
reader disagreement (errors (3) and (4) above) they can be equally subject to
test error (2). No test can eliminate the problem of student error, although it can
be very considerably reduced by multiple testing.
In these tests the text used as the basis of the test, whether it is dictated or
damaged (cloze, C-Tests), or expanded by irrelevant words (cloze elide) or has
questions formulated about it, is not selected because of its intrinsic interest per
se. Its function is that of a language sample, and as such it is regarded as interchangeable
with any other text which could have been used in the test. Because
of possible item bias (qualities in the item which favour some examinees but
disadvantage others) more than one text should be used. This also has advantages
for subsequent test analysis since most statistical procedures available can
only legitimately be used if items are independent of each other. Thus the traditional
statistical analysis of cloze tests, for instance, on the basis of individual
blanks in the text, is not legitimate since the items are embedded in the same
text and are thus dependent on each other. What is possible is analysis on the
super-item level using each text or task as an item. This is the approach adopted
with the C-Test (cf. Klein-Braley and Raatz 1985).
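Analysis on the super-item level can be sketched as follows, with each text
contributing one score per examinee. The sketch uses Cronbach's alpha, a
standard internal-consistency coefficient; its choice here is an illustration
and the scores (five examinees, four texts of 25 blanks each) are invented.

```python
from statistics import pvariance


def cronbach_alpha(score_matrix):
    """Cronbach's alpha, where score_matrix[i][j] is examinee i's score on text j."""
    n_items = len(score_matrix[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two super-items")
    # Variance of each text's scores across examinees.
    item_variances = [
        pvariance([row[j] for row in score_matrix]) for j in range(n_items)
    ]
    # Variance of the examinees' total scores.
    total_variance = pvariance([sum(row) for row in score_matrix])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical text scores: one row per examinee, one column per text.
text_scores = [
    [20, 18, 22, 19],
    [12, 10, 14, 11],
    [24, 23, 25, 22],
    [8, 9, 7, 10],
    [16, 15, 18, 14],
]
alpha = cronbach_alpha(text_scores)
```

Because each text is treated as a single item, the dependence between blanks
within a text no longer violates the independence assumption.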
The objective procedures have the advantage that it is reasonably easy to
produce highly reliable tests. The test scores are numerical with a fairly wide
range on an interval scale, whereas the subjective procedures are generally
scored on (ordinal level) rating scales, and rarely use more than 5, or at the most
7, categories.
Like all tests, these tests need to be put through test development procedures,
but in most cases it is not difficult to develop acceptable tests. The main
exception seems to be the classical nth word deletion cloze test, which has been
shown to be highly erratic in its performance (cf. Alderson 1979; Klein-Braley
1981) and which is difficult to score: if exact scoring is used (= only replacement
of the original word is counted as correct) then it is often too difficult for
learners of foreign languages (it is often too difficult for mother tongue learners
too! — cf. Klein-Braley 1982), and if acceptable scoring is used then a great deal
of time can be spent in agreeing on what is acceptable, which casts away all the
advantages of an objective procedure.
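The difference between the two scoring methods can be shown in a few lines of
Python. The answer key, the responses and the list of acceptable alternatives
below are invented; in practice the alternatives are exactly what the scorers
would have to agree on beforehand.

```python
def score_cloze(responses, answer_key, acceptable=None):
    """Count correct blanks.

    With acceptable=None only the original word counts (exact scoring).
    Otherwise acceptable maps a blank's index to extra words that also count.
    """
    acceptable = acceptable or {}
    score = 0
    for index, (response, original) in enumerate(zip(responses, answer_key)):
        allowed = {original.lower()}
        allowed.update(word.lower() for word in acceptable.get(index, ()))
        if response.strip().lower() in allowed:
            score += 1
    return score


# Hypothetical three-blank example.
answer_key = ["quickly", "house", "because"]
responses = ["rapidly", "house", "since"]
agreed_alternatives = {0: {"rapidly"}, 2: {"since", "as"}}

exact_score = score_cloze(responses, answer_key)
acceptable_score = score_cloze(responses, answer_key, agreed_alternatives)
```

The scoring itself is trivial in either case; the cost of acceptable scoring
lies entirely in compiling the table of alternatives.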
In the context of a research study the objective tests of linguistic performance
would be the ones to look at first. They are relatively easy to produce,
fairly easy to explain to the test takers, and the scoring is objective, though up to
now it cannot be performed by machine — with the exception of the cloze elide
test where ETS holds a patent for machine-scorable forms.
At the same time it should be pointed out that these are proficiency, not
achievement tests. They are not curriculum-oriented. Their purpose is to place
learners on a continuum from zero to 100% linguistic performance. They are
not designed to reveal small increments of linguistic knowledge, the control of
individual units, the ability to manipulate specific structural rules. Nor are they
diagnostic tests. This means that normally they have no justification as classroom
tests — since in my opinion learners have a right to expect that tests administered
as part of the learning process should in some way be related to what has been
taught. In the context of a research study, on the other hand, their absence from
the normal classroom can probably be viewed as a benefit, since they will be unusual
and interesting measures whose face validity is self-evident as a result of
the research context.
3.3 Tests of communicative performance
The difference between tests of linguistic performance and those of communicative
performance is precisely the same distinction as that made between
linguistic and communicative competence. Using language communicatively
means using the appropriate language in a given situation. Nowadays it seems to
be generally understood that linguistic competence forms the prerequisite for
communicative competence, i.e. that the language elements used in communicative
encounters must be correct in both senses.
Just as there is no genuinely valid test, there is also no genuinely communicative
test. To communicate is to use language to fulfil a need. To test involves
persuading someone to simulate using language as though a need were to be fulfilled.
And it then involves assessing the performance put on by the examinee.
So in an oral test we may say, "You have just bought a radio and when you get it
home it turns out to be defective. So you go back to the shop. What do you say to
the manager?" Both the test and the genuine situation could be regarded as
stressful, but they are stressful in entirely different ways: to achieve a satisfactory
outcome in the real situation the buyer has to persuade the manager to replace
the radio, whereas in the test situation the student has to convince the tester that
he or she is able to cope verbally with the function of "complaining". Moreover,
strategies effective in the real world such as shouting so that other customers
hear what is going on, bursting into tears, taking along one's big brother, are not
considered kosher in tests. So no test is genuinely communicative: a test can, at
best, be quasi- or pseudo-communicative.
This does not mean that tests cannot be made more meaningful, more like
real life, more intrinsically interesting. They can, and this is in my view one of
the main advantages of the communicative approach to language teaching: not
that people seem to learn the language any better (in terms of accuracy it seems
very often that they learn it less well, which is a worry to those involved in very
high level language teaching), but they do at least seem to enjoy learning languages
more if the linguistic tasks they have to perform seem more relevant to
everyday life. The same can apply to testing.
It should be realized however that this approach to testing, carried to its
logical conclusion, has more pitfalls than are generally recognised. Tasks should
have the appearance of authenticity. So we have to rethink our definitions. What
is listening comprehension? When precisely are we deprived of all other sensory
channels for communication? I can only come up with the telephone, the radio
and the station/airport loud speakers. Fair enough. But then we realize that we
also have to throw out all multiple-choice measures — and this leaves us in rather
a quandary so far as testing reading and listening is concerned. What is the average
everyday response to reading a book, or to listening to a radio programme?
Normally there is no visible response at all! Admittedly we could get round this
problem by using specialised materials: a comedy programme, perhaps, and
counting the laughs. But this is (a) unsatisfactory sampling and (b) may be affected
not by the examinee's level of language proficiency but by his or her sense
of humour. Similarly essays go overboard. The only people who regularly write
essays as part of everyday life are schoolchildren and language students. Their
mothers and fathers don't. They write letters, shopping lists, notes for the cleaning
lady, and possibly a variety of texts in their professional capacities. But they
don't write essays.
A second problem with communicative tests is that of judging the outcome
of the test procedures. What is to be judged? The adequacy with which the student
performs the given task? But the task itself, as we have seen, is only quasi-authentic.
And just how is the language used in performing the task to be
assessed? Adequacy? Fluency? Correctness? Amount of foreign accent? Olshtain
and Blum-Kulka (1985: 28) make the following suggestion:
"Since one of the outstanding features of speech act behavior is variability, ex
pected outcomes on the test given to learners of the language will need to
allow for such variability. The tester needs to relate, therefore, to a range of
acceptable answers. How can this range be established? One possible remedy
might be to follow the principle of administering any functional test to native
speakers of the target language first, in order to establish the acceptable variation
of answers. Accordingly, the tester will be able to evaluate the learner's
answers by comparing them to the native norms of variability on the same test
and within the very same testing item".
There is obviously a great deal of spadework to be put in before tests are
ready for use.
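One very rough way of operationalising the suggestion quoted above is sketched
below in Python. It assumes that native-speaker answers to a single item have
already been coded into response types; the type labels, the function name and
the 20 per cent cut-off are all assumptions made for the example and are not
proposed by Olshtain and Blum-Kulka.

```python
from collections import Counter


def acceptable_range(native_codes, min_share=0.2):
    """Response types used by at least min_share of the native-speaker sample."""
    counts = Counter(native_codes)
    total = sum(counts.values())
    return {code for code, n in counts.items() if n / total >= min_share}


# Hypothetical coded answers from ten native speakers to one test item.
native_codes = [
    "apology", "apology", "apology", "apology", "apology",
    "apology+offer", "apology+offer", "apology+offer",
    "explanation", "joke",
]
norm = acceptable_range(native_codes)

# A learner's coded answer is then compared with the native range.
learner_code = "apology+offer"
learner_within_norm = learner_code in norm
```

Where the cut-off is placed is itself a decision that would need empirical
justification, which is part of the spadework referred to above.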
A third problem is that of adequate sampling of different areas of language
performance. It is rarely of interest whether the specific task forming part of the
test can be performed; this task is rather to be viewed as a sample enabling
generalisations to be made about the examinee's performance on other tasks.
Very little is known as yet about the transfer between different functions or notions:
it might seem logical that transfer should take place, but in the testing
context empirical confirmation of the inspired hunch is required.
I feel it is important to stress these points because the proponents of communicative
testing seem only interested in certain aspects of "authenticity",
generally the question of the type of texts involved, and not in others. If we look
at tests which are currently labelled "communicative" we will see that there is a
great deal of inconsistency and woolly thinking in this area (cf. the criticisms of
Skehan 1984). Morrow (1979) has accused language testers of lagging behind
teaching methodology and of failing to produce communicative tests. In view of
the theoretical problems involved and the research that needs to be done before
a satisfactory test can be developed, this is not really surprising.
One area of communicative testing where quite a lot of progress has been
made is that of developing tests aimed at assessing the examinee's ability to perform
job-related tasks: examinees are expected to demonstrate that they can
cope linguistically with tasks and situations of the type likely to be met in a
known and specified professional or educational context (cf. e.g. Hauptman et
al. 1985; Rea 1988, but also the criticism from Skehan 1984). These tests are too
specific for use with generalised classroom groups, though this approach could
be used in a research study if the context is one in which the foreign language is
being learned for clearly defined purposes.
4 Conclusions
In any research study it will probably be desirable to collect various different
types of data. Nevertheless, wherever possible objective measures should be
used, and I have tried to show that in the language testing area there are a variety
of techniques available to researchers which will enable them to develop
tests and items tailored to the questions being asked in the context of the study.
"Outside measures" such as the TOEFL, the British Council Tests, the Cambridge
Certificates etc., despite their prestige and the painstaking research that
has been invested in them, are often too unspecific to be of use. What is important
is that the testing should be discussed right at the beginning of the planning
stage, not only because preliminary research (= test development) needs to
start well in advance of the actual study itself, but also because questions can
only be answered if they are put in the right way. And this means that the tests
must work properly.
I have tried to show in the context of this contribution what language testers
understand by "working properly". Tests must conform to basic standards both
psychometrically and linguistically. This always involves a great deal of work, because
although there are tests which are relatively easy to construct (essays for
example) the scoring is a nightmare if it is to be done properly. On the other
hand the discrete-point-item test can need several cycles of painstaking test development,
but it can subsequently be used with very large groups and administered
and scored by ancillary personnel because the test development
procedures have made it relatively foolproof. My own preference, speaking
now as a language teacher, would be to invest the effort in test development.
But then I hate marking student papers!
It may seem that in focussing so much attention on the tests I am implying
that much of the effort (and funding) going into the research project needs to
be invested in the tests. This is, in fact, exactly what I am suggesting, since in any
piece of research satisfactory, i.e. interpretable and reliable, results can only be
obtained if the measurement procedures are functioning properly. No amount
of effort or statistical manipulation can rescue a research study if the tests have
been designed to answer the wrong questions or if they are not sufficiently sensitive
to detect possible effects.
Acknowledgement
Thanks, as always, are due to my research partner, colleague and friend Ulrich Raatz, Professor
of Clinical Psychology at the University of Duisburg, for his helpful comments and criticism. All
remaining errors are entirely my own work.
References
Alderson, J. Charles. 1979. "The cloze procedure and proficiency in English as a foreign lan-
guage." TESOL Quarterly 11.59-67.
APA: American Psychological Association. 1974. Standards for educational and psychological
tests. Washington: APA.
Bachman, Lyle F. and Adrian S. Palmer. 1981. "The construct validation of the FSI Oral Inter-
view." Language Learning 31/1.67-86.
Bachman, Lyle F. 1981. The trait structure of cloze test scores. Paper presented at the 1981 TESOL
Midwest Regional Conference, Champaign/Urbana.
Campbell, Donald T. and Julian C. Stanley. 1963. "Experimental and quasi-experimental designs
for research on teaching." Handbook of research in teaching ed. by N.L. Gage, 171-246. Chi-
cago: Rand McNally and Co.
Carroll, John B. 1961. "Fundamental considerations in testing for English language proficiency of
foreign students." Testing the English proficiency of foreign students ed. by Center for Applied
Linguistics, 30-40. Washington, DC: Center for Applied Linguistics.
Cronbach, Lee J. 1984. Essentials of psychological testing. New York: Harper and Row.
Engels, Leopold K. 1982. "Testing and mastery learning of English vocabulary at university level."
Practice and problems in language testing III. Studiereeks van het tijdschrift van de Vrije
Universiteit Brussel, 10 ed. by Madeline Lutjeharms and Terry Culhane, 144-157. Brussel:
VUB.
French, John W. 1961. "Schools of thought in judging excellence of English themes." Testing prob-
lems in perspective ed. by Anne Anastasi, 587-596. Washington, DC: American Council on
Education.
Gaies, Stephen J. 1988. "Validation of the Noise Test." In Grotjahn, Klein-Braley and Stevenson
1988.41-74.
Grotjahn, Rüdiger, Christine Klein-Braley and Douglas K. Stevenson, eds. 1988. Taking their
measure: the validity and validation of language tests ( = Quantitative Linguistics, 34.) Bo-
chum: Studienverlag Dr. N. Brockmeyer.
Harris, David P. 1969. Testing English as a second language. New York: McGraw-Hill.
Hauptman, Philip C., R. LeBlanc and M. Bingham Wesche, eds. 1985. Second language performance
testing. Ottawa: University of Ottawa Press.
Henning, Grant, Hudson, G. and Turner, J. 1985. "Item response theory and the assumption of
unidimensionality for language tests." Language Testing 2.141-154.
Klein-Braley, Christine and Ulrich Raatz, eds. 1985. C-Tests in der Praxis. Bochum: Fremdsprache
und Hochschule: AKS Rundbrief 13/14.
Klein-Braley, Christine. 1981. Empirical investigations of cloze tests. Ph. D. Dissertation, University
of Duisburg.
Klein-Braley, Christine. 1982. "On the suitability of cloze tests as measures of reading comprehen-
sion." Lezen in Onderwijs en Onderzoek ( = Toegepaste taalwetenschap in artikelen, 13) ed. by
A.J.M. van der Geest, C.J.Koster and J.F. Matter, 49-61. Amsterdam: VU Boekhandels.
Klein-Braley, Christine. 1983. "A cloze is a cloze is a question." Issues in language testing research
ed. by John W. Oller Jr., 218-228. Rowley, MA: Newbury House.
Klein-Braley, Christine. 1985. "A cloze-up on the C-Test." Language Testing 2.76-104.
Klein-Braley, Christine. 1987. "Fossil at large: translation as a language testing procedure." Grot-
jahn, Klein-Braley and Stevenson 1988.111-132.
Lado, Robert. 1961. Language Testing. London: Longman.
Madsen, Harold S. and Randall L. Jones. 1981. "Classification of oral proficiency tests." The con-
struct validation of tests of communicative competence ed. by Adrian S. Palmer, Peter J.M.
Groot and George A. Trosper, 15-30. Washington, DC: TESOL.
Manning, Winton H. 1986a. "Using technology to assess second language proficiency through
Cloze-Elide tests." Technology and language testing ed. by Charles W. Stansfield, 147-166.
Washington, DC: TESOL.
Manning, Winton H. 1986b. Development of Cloze-Elide tests of English as a second language.
Draft final report submitted to the TOEFL Research Committee. Princeton, NJ: Educa-
tional Testing Service.
Morrow, Keith. 1979. "Communicative language testing: revolution or evolution?" The communi-
cative approach to language teaching ed. by Christopher J. Brumfit and Keith Johnson, 143-
157. Oxford: Oxford University Press.
Oller, John W. Jr. 1973. "Cloze tests of second language proficiency and what they measure."
Language Learning 23.105-118.
Oller, John W. Jr. 1979. Language tests at school. London: Longman.
Olshtain, Elite and Shoshana Blum-Kulka. 1985. "Crosscultural pragmatics and the testing of
communicative competence." Language Testing 2/1.16-30.
Olshtain, Elite and Tamar Feuerstein. 1988. "Computer assisted global textual analysis". Paper
presented at the 13th International LAUD Symposium on Linguistic Approaches to Artifi-
cial Intelligence, Duisburg.
Raatz, Ulrich. 1985. "The factorial validity of C-Tests." Klein-Braley and Raatz 1985.42-54.
Rea, Pauline M. 1988. "Testing doctors' written communicative competence: an experimental
technique in English for specialist purposes." Grotjahn, Klein-Braley and Stevenson
1988.185-218.
Skehan, Peter. 1984. "Issues in the testing of English for specific purposes." Language Testing
1/2.202-220.
Spolsky, Bernard. 1973. "What does it mean to know a language; or how do you get somebody to
perform his competence?" Focus on the learner ed. by John W. Oller Jr. and Jack Richards,
164-176. Rowley, MA: Newbury House.
Spolsky, Bernard. 1981. "Some ethical questions about language testing." Practice and problems in
language testing I ed. by Christine Klein-Braley and Douglas K. Stevenson, 5-30. Frankfurt:
Verlag Peter D. Lang.
Spolsky, Bernard. 1985. "What does it mean to know how to use a language? An essay on the the-
oretical basis of language testing." Language Testing 2/2.180-191.
Underhill, Nic. 1987. Testing spoken language. A handbook of oral testing techniques. Cambridge:
Van Weeren, Jan. 1981. "Testing oral proficiency in everyday situations." Practice and problems in
language testing I ed. by Christine Klein-Braley and Douglas K. Stevenson, 96-124. Frankfurt:
Verlag Peter D. Lang.
Item Response Theory and Reduced Redundancy
Techniques: Some Notes on Recent Developments in
Language Testing
Mats Oscarson
In the heyday of comparative investigations into language teaching methodology,
some 20 years ago, researchers tended to be a little over-optimistic as regards
the possibilities of disclosing "true" differences in attainment resulting
from different experimental treatments. The statistical methods used for analysing
test data were not always very sophisticated and the measurement instruments
employed were sometimes of a rather crude nature. The intention of the
present paper, which falls into two parts, is to comment on some recent developments
in these two fields, i.e. those of item data analysis and test construction. A
powerful analytic model, the so-called item response theory, which is now coming
into use in language testing, is briefly reviewed, and some advances in the
development of testing techniques, particularly the cloze procedure, are indicated.
It is suggested that developments such as these will improve the prospects
for successful research into the relative effectiveness of contrasting language
learning strategies, which is one of the topics at this conference.
1 Theories of Testing
1.1 Classical Test Theory
The paramount concern of language testers, as of all testers, is the question
of how to identify reliable and valid test items. The steps involved in the quest
for such items typically include pre-testing of a pilot version of a set of items (a
test), checking of the psychometric properties of the test as a whole and of the
individual items, and selection of satisfactory items for inclusion in the final product.
In classical test theory the focus of empirical checks is on aspects such as
ascertaining suitable difficulty levels and appropriate item discriminative power,
and establishing test reliability and validity properties, usually by means of
correlational methods (resulting in internal consistency measures and inter-correlations).
Not infrequently, statistics calculated according to classical theory
carry meaning only in relation to a given sample of persons who have taken the
test in question and in relation to the particular set of items included in the test.
In other words, the estimates are relative to sampling characteristics. They are
sample-dependent both in respect of the sample of persons involved and in respect
of the sample of items used. There is no way of quantifying the test statistics
in objective or absolute terms, and this must be considered a weakness in
classical theory.
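The sample dependence of classical statistics can be illustrated with a small sketch. The code below is not taken from any of the studies cited; it is a minimal illustration, assuming dichotomously scored items, of three familiar classical indices: item facility (proportion correct), the uncorrected point-biserial discrimination index, and the KR-20 internal consistency coefficient. The function name and the data layout are hypothetical.

```python
import math

def item_statistics(responses):
    """Classical item analysis for dichotomous (0/1) data.

    responses: one list of 0/1 item scores per person (hypothetical layout).
    Returns (per-item statistics, KR-20 coefficient).
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n_persons
    # Population variance of the total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_persons
    sd_total = math.sqrt(var_total)

    stats = []
    for i in range(n_items):
        scores = [person[i] for person in responses]
        p = sum(scores) / n_persons  # facility: proportion answering correctly
        if 0 < p < 1 and sd_total > 0:
            # Mean total score of those who answered this item correctly.
            mean_correct = sum(t for t, s in zip(totals, scores) if s) / sum(scores)
            r_pb = (mean_correct - mean_total) / sd_total * math.sqrt(p / (1 - p))
        else:
            r_pb = float("nan")  # undefined when everybody (or nobody) succeeds
        stats.append({"facility": p, "point_biserial": r_pb})

    sum_pq = sum(s["facility"] * (1 - s["facility"]) for s in stats)
    if var_total > 0 and n_items > 1:
        kr20 = (n_items / (n_items - 1)) * (1 - sum_pq / var_total)
    else:
        kr20 = float("nan")
    return stats, kr20

# The same three items given to a stronger and a weaker group (invented data).
strong_group = [[1, 1, 1], [1, 1, 0], [1, 1, 1], [1, 0, 0]]
weak_group = [[1, 0, 0], [0, 0, 0], [1, 1, 0], [0, 0, 0]]
for label, group in (("strong", strong_group), ("weak", weak_group)):
    stats, kr20 = item_statistics(group)
    print(label, [s["facility"] for s in stats], kr20)
```

With these invented data the first item looks trivially easy in one group and of medium difficulty in the other, although the item itself has not changed. This is the weakness of classical item statistics referred to above.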
1.2 Item Response Theory
More recently, classical techniques for assessing the statistical properties of
test items have been supplemented by latent trait measurement theory, or item
response theory (IRT). This is a technique, or family of techniques, which has
been developed over the last two or three decades, mainly by psychometricians
active in the behavioural sciences. In the last ten years the theory has also attracted
increasing attention in the field of language testing and very interesting
research results have begun to appear. The major strength of the theory is that it
offers a solution to the problem of lack of generality of item statistics in classical
measurement theory (touched on above). Briefly, the theory enables the tester
to estimate ability parameters independently of the particular configuration of
items used in the test (= "item-free" or "non-item-specific" person ability
measurements) as well as to estimate item difficulties independently of the
ability structure in the group of subjects being tested (= "person-free" or
"non-person-specific" item difficulty measurements). In the following sections of the
paper, I will try to describe and assess the potential of these advances in the context
of the research we are discussing here and I will start by reviewing some of
the claims that have been made for the theory (cf. for instance Henning 1987).
To begin with, latent trait theory offers, as already indicated, the considerable
advantage of "person-free" assessment of item characteristics. This implies,
for instance, that estimates of the relative difficulty of items are made in such a
way that they may be regarded as invariant over different ability levels (i.e. over
different groups of testees). The theory also enables us to compare abilities of
persons taking different clusters of items provided, of course, that these clusters
are drawn from a common bank of calibrated items. There are several important
benefits associated with these circumstances, some of which will be discussed
later in this paper.
Secondly, application of latent trait models in testing facilitates the construction
of more accurate and more sensitive test instruments, owing to the fact
that it enables the researcher to detect weak ("misfitting") items more easily (cf.
research in the testing of foreign language reading comprehension reported by
Perkins and Miller 1984, and by Henning 1984). A likely consequence of this improvement
in analytic procedures is that the number of items in a test may be reduced
without an attendant loss of measurement precision.
Thirdly, latent trait measurement allows us, in principle, to analyze and
determine the construct, or trait, which underlies performance on any given test.
In other words, latent trait theory provides us with a possible tool for analyzing
what the tests measure. Research in this area has been conducted by, for instance,
de Jong and Glas (1987), who concluded, on the basis of analyses of empirical
data, that foreign language listening comprehension tasks which require
"literal understanding" discriminate better between native speakers and non-native
learners than do items which tap the ability to interpret meanings beyond
the literal level. (See also research carried out by Willmott and Fowles 1974,
Perkins and Miller 1984).
Below I will exemplify the general principles of latent trait measurement
theory by referring to the most commonly used model, i.e. the one developed by
the Danish mathematician Georg Rasch. As for basic factual information, I will
be drawing, to a large extent, on articles published in recent issues of Language
Testing and on reports by Gustafsson (1980, 1981). For further details the
reader is also referred to Rasch (1960) and Wright and Stone (1979).
1.3 The Rasch Model
The one-parameter Rasch model is the simplest of the latent trait models
and it is the one that has been most commonly used in language testing as well
as in many other disciplines. It uses only one parameter to describe each item
("difficulty") and only one parameter to describe each person ("ability"), hence
the designation. Further, the model states that the probability of a correct response
to an item is a simple logistic function of these two parameters. By use of
the model it is possible (under certain assumptions, see below) to predict the likelihood
of a correct answer to a given test item on the basis of knowledge of
only two variables, item difficulty and person ability. The mathematical function
that relates the probability of a correct item response to the ability variable (the
latent trait) is described graphically by a so-called item characteristic curve
(ICC), which typically takes the form of an elongated S (see Figure). On the
basis of the item characteristic curve it is possible to make an estimate of the
probability (p) of a correct response to the item at any given student ability
level. For example, for a person at ability level -2 (representing an independent
assessment on a transformed z-scale) the probability of responding correctly to
item j is .4 (i.e. there is a 40% chance that the person will obtain a correct score).
For a person at ability level 1 the probability of success on item k is .7. The value
of p depends of course, as always, on person ability and item difficulty.
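The logistic function referred to above can be written out in a few lines. The sketch below is purely illustrative and rests on an assumption that is not stated in the text, namely that the ability axis of the figure can be read as a logit scale. Under that assumption the probability of success is exp(ability - difficulty) / (1 + exp(ability - difficulty)), and the two example points quoted for items j and k each imply one item difficulty. The function names are hypothetical.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the one-parameter Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def difficulty_from_point(ability, p):
    """Item difficulty implied by one point (ability, p) on the item's curve.

    Assumes ability is expressed in logits, which is an assumption made here
    for illustration and not a property claimed for the figure in the text.
    """
    return ability - math.log(p / (1.0 - p))

# The two examples from the text: p = .4 at ability -2 (item j),
# p = .7 at ability 1 (item k).
b_j = difficulty_from_point(-2.0, 0.4)
b_k = difficulty_from_point(1.0, 0.7)

# Trace both item characteristic curves over a range of ability levels.
for ability in (-3, -2, -1, 0, 1, 2, 3):
    print(ability,
          round(rasch_probability(ability, b_j), 2),
          round(rasch_probability(ability, b_k), 2))
```

In this parameterisation a person whose ability equals the item difficulty has exactly a 50% chance of success, and item j turns out to be the easier of the two items.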
The response pattern observed for a given item can be tested statistically for
goodness-of-fit. For an item that does not fit the model, the item characteristic
curve will deviate more or less markedly from the pattern portrayed in the figure.
Figure. Typical item characteristic curves (for items j and k).
A practical implication of the Rasch approach to item analysis is that the
parameters of item difficulty and person ability may be conveniently expressed
on a common (one-dimensional) continuum. In classical test theory they are
necessarily defined on separate scales. (For a brief sketch of the mathematical
procedures involved in deriving the Rasch scale, see for instance Woods and
Baker 1985.)
Rasch analysis is applicable both to tasks which are scored right or wrong
(i.e. dichotomously) and tasks which are rated on a scale. The latter facility is an
extension of the original theory and is usually referred to as the Partial Credit
method (described in detail in Wright and Masters 1982). A very clear and instructive
demonstration of the usefulness of this form of the Rasch model is
given by Pollitt and Hutchinson (1987), who employed it in the analysis and calibration
of a number of free writing tasks.
The Rasch model (like any other latent trait model) is based on a number
of assumptions concerning the nature of the data under analysis. The most important
of these are (1) the assumption of unidimensionality, which means that
the test must be homogeneous, i.e. each item in the test must measure the same
characteristic and (2) the assumption of local stochastic independence, which
means that performance on one item must not be affected by performance on
other items in the test. (There are, in addition, certain other requirements associated
with the use of the model, but I will not go into them in this context.)
From the assumption of unidimensionality it follows that poor fit to the
model will be obtained if the test is heterogeneous, for instance if some of the
items measure the ability aimed at and little or nothing else, whereas the others
measure a slightly different ability, or the intended ability plus something else.
This is not uncommon in foreign language listening comprehension testing, for
example, where a less relevant variable such as the ability to remember detailed
information may easily become over-represented among the more important
components of the skill one wants to assess.
It is usually important to be able to ascertain unidimensionality of test
measures in the context of measurement of the relative effects of different instructional
treatments, for instance when the same post-test is used to gauge instructional
effects. The reason for this is that there is always a risk that there may
be interactions between treatments and test scores resulting from covariation of
item difficulties and one (but not the other) treatment. What is needed in such
cases is a homogeneous test which measures the same thing in all treatments.
There is probably a good chance of avoiding this potential source of error if
Rasch analysis of test item data is undertaken.
Generally speaking, violation of underlying assumptions will of course endanger
the validity of results obtained by use of the model. However, the degree
to which one may accept departures from the assumptions is sometimes a matter
of practical judgement. Gustafsson (1980: 226) states that "... the fit of the data
to the model is important but the question of fit is nevertheless subordinate to
the solution of concrete measurement problems. This implies that lower standards
of fit can sometimes be set, that all possible deviations from the model assumptions
need not necessarily be considered and that in fact large deviations in
the data from the model assumptions can sometimes be tolerated". Furthermore,
the assumptions under which one may apply the Rasch model are largely
the same as those which make application of classical test theory permissible, as
when a point-biserial correlation is calculated in order to ascertain item discrimination
power or when a reliability index is calculated in order to estimate the internal
consistency of a set of test items.
As heterogeneity of items violates the assumption of unidimensionality, a
test should be checked for this before the Rasch model is employed, for instance
by use of factor analysis. However, application of factor analysis to dichotomous
data is considered to be problematic (Hambleton and Swaminathan 1985: 156).
In a collection of papers edited by Hughes and Porter (1983) it is pointed out
that the exploratory use of factor analysis tends to result in over-estimation of
the magnitude of the first factor, i.e. of the common variance. Instead, increasing
attention is now being paid to confirmatory approaches, i.e. to methods which
allow the researcher to make a statistical comparison between the predictions of
a model and the results obtained empirically (Palmer and Bachman 1981;
Adams, Griffin and Martin 1987).
Finally a note on the interpretation of the notion of unidimensionality. By
means of the latent trait approach to item analysis we reject items which do not
fit the model (which, by the way, only means that those items do not function
well in combination with the other items in the particular test in hand, not that
they are necessarily poor items in a different context). This operation would
seem to constitute a potential risk with respect to test validity. After we have discarded
misfitting items, it would appear that we will be left with a test instrument
which measures a very narrow range of abilities, or indeed just one single
refined ability in accordance with the fundamental requirement of the model,
i.e. that of unidimensionality of scores. The following question might then be
raised, at least in the area of language testing: Doesn't this result in a loss of validity?
After all, linguistic competence is a highly complex attribute, even when
we restrict our attention to sub-skills such as comprehension of spoken language,
command of grammar, lexical control etc. Gustafsson (1977: 88) offers a
solution to this perplexing issue by stating that even if only one single variable is
measured with the same test "it does not mean that the latent trait in itself is
unidimensional; it may well be functionally (and factorially) complex and we can
certainly not claim that there is one unitary process underlying test performance".
The view is supported empirically by Henning, Hudson and Turner
(1985), who studied the problem using a 150-item multi-skill language proficiency
test. Examination of test data (item fit statistics etc.) indicated no violations
of the assumption of unidimensionality even though the test consisted of
subtests measuring such diverse skills as listening and reading comprehension,
grammar accuracy, vocabulary recognition, and writing error detection. (Research
in this area, involving foreign language reading comprehension, has also
been reported by Willmott and Fowles 1974.) Thus it is probably safe to say that
the apparent threat to test validity posed by the assumption of unidimensionality
is not a real one and that there is no conflict between the requirements of
"coverage of domains" and measurement in one dimension.
1.4 Application of the Rasch Model
Application of the Rasch model provides the researcher with information
on how to organise the test items in terms of level of difficulty, spread of item
difficulty, test length etc. in order to obtain optimal precision of measurement.
This may be viewed as the general and primary function of the model. The following
particular areas of application follow, directly or indirectly, from this
general description.
1.5 Test Equation
By means of test linking or test equation using IRT techniques one may
compare scores obtained on different tests (including scores obtained on tests at
different levels of difficulty, so-called vertical equation). Basically, the comparison
is made possible by use of a set of link items common to both tests, or any
pair of tests if more than two are being calibrated (see Wright and Stone 1979
for a full description). This type of application of latent trait procedures is highly
relevant in research into the effects of instructional treatments, for instance in
long-term longitudinal studies which involve measurement at distant time intervals
and which may therefore necessarily involve updating of parts of the tests
being used. Another obvious area of application is in classical pre-test/post-test
research designs.
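The role of the link items can be sketched as follows. The example is a simplified illustration and not a reproduction of the procedure described by Wright and Stone: it assumes that the two tests have been calibrated separately, that each calibration has produced Rasch item difficulties on its own arbitrary origin, and that a single additive constant, estimated as the mean difference over the common items, is enough to bring one scale onto the other. All names and figures are hypothetical.

```python
def linking_constant(difficulties_a, difficulties_b):
    """Additive shift that maps the scale of test B onto the scale of test A.

    Both arguments are dicts of item label -> calibrated difficulty.
    Only the items present in both tests (the link items) are used.
    """
    link_items = sorted(set(difficulties_a) & set(difficulties_b))
    if not link_items:
        raise ValueError("the two tests share no link items")
    differences = [difficulties_a[i] - difficulties_b[i] for i in link_items]
    return sum(differences) / len(differences)

def to_scale_a(value_on_b, shift):
    """Express a difficulty or ability measured on test B on the scale of test A."""
    return value_on_b + shift

# Invented calibrations: items L1 and L2 appear in both tests.
test_a = {"L1": -0.5, "L2": 0.3, "A3": 1.0}
test_b = {"L1": -1.0, "L2": -0.2, "B3": 0.4}

shift = linking_constant(test_a, test_b)
print(shift)                   # mean difference over the link items
print(to_scale_a(0.4, shift))  # item B3 expressed on the scale of test A
```

In practice one would also inspect the individual link-item differences, since a link item whose difference departs markedly from the others may not be functioning in the same way in the two tests.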
1.6 Item Banking
Another very useful application of the theory is that of creating a pool of
items with known and invariant measurement characteristics. As has already
been pointed out, the model provides estimates of item difficulties which are
meaningful irrespective of ability level tested, i.e. of the particular sample of
persons taking the test, and this affords the conceptual basis for this extension of
the use of the Rasch model. Any set of items drawn from a bank constituted in
accordance with latent trait criteria yields measurements that can be directly related
to those of any other set of items drawn from the same pool. The relevance
of this facility is particularly obvious in situations which require precise assessments
in relation to some absolute standard of performance.
It should be added that the feasibility and usefulness of item banking on the
basis of latent trait measurement principles has been a matter of some dispute
among psychometricians (see for instance Woods and Baker 1985). The controversy
relates primarily to the question of whether test characteristics established
by means of the Rasch model can in fact be assumed to remain constant
over time.
1.7 Tailored Testing
By virtue of the fact that the Rasch model makes provision for "person-independent"
item calibration, it is possible to minimize the errors of measurement
inherent in any set of test scores. Theoretically, the condition of minimal
measurement error obtains when all subjects only take items on which the probability
of responding correctly is equal to the probability of responding incorrectly
(i.e. when p = .50). Therefore it is always an advantage if one can
administer different sets of items, each at a suitable level of difficulty in terms of
probability of a correct response, to different groups of examinees, rather than
administering the same set of items to all subjects. As already indicated, the
Rasch one-parameter model provides one way of doing just that, i.e. of tailoring
the test to suit the particular target group in hand. The resulting gain in measurement
precision and cost-effectiveness is of great interest in many educational
research contexts as well as in institutional test-administration programmes.
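The condition p = .50 can be made concrete with a short sketch. Under the Rasch model the statistical information contributed by an item at a given ability level is p(1 - p), which reaches its maximum of .25 when p = .50, i.e. when item difficulty equals person ability. The code below is a hypothetical illustration of how an item might be picked from a calibrated bank on that basis; the bank, the item labels and the function names are invented.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def item_information(ability, difficulty):
    """Information p * (1 - p); largest when p = .50."""
    p = rasch_probability(ability, difficulty)
    return p * (1.0 - p)

def best_item(ability, bank):
    """Label of the bank item that is most informative at this ability level."""
    return max(bank, key=lambda label: item_information(ability, bank[label]))

# Invented bank of calibrated items: label -> difficulty in logits.
bank = {"easy": -2.0, "medium": 0.0, "hard": 2.0}

for ability in (-1.8, 0.1, 1.8):
    print(ability, best_item(ability, bank))
```

An examinee of low ability is thus routed to the easy item and an examinee of high ability to the hard one, which is the sense in which the test is tailored to the group in hand.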
1.8 Test-Content Bias—A Threat to Comparative Research
As we have seen, the Rasch model can be used for test dimensionality
"check-up". The model rejects items which measure something other than the
majority of items in the test. The practical implication of such a function is of utmost
interest in the type of research we are considering here (and indeed in any
type of research which aims at measuring a specific ability variable by means of a
test); it is imperative that we know that each measurement instrument quantifies
a single defined ability at a time and not a conglomerate of abilities, of whose
existence we may not even be aware in each case. The reason for
this is of course that an intruding or ill-defined variable (in a post-test) may easily
co-vary with the effects of one (but not some other) treatment under comparison,
thus seriously undermining the validity of the post-test scores. What we
want in experimental educational research are well-defined treatments and
equally well-defined test functions, strictly attuned to the specification of objectives
common to the experimental treatments. Only then will we be able to draw
the right conclusions about the effects of the treatments under investigation.
Seen in this perspective, the issue of test dimensionality becomes one of profound
importance. Latent trait theory is one (but not the only) tool that can be
used in order to establish the characteristics of a language test in this respect.
1.9 Summing up
In sum, then, I would like to argue in favour of exploiting the insights that
have been gained in recent decades in the area of statistical item analysis. For
such purposes as we are discussing here, i.e. the scientific evaluation of language
learning and teaching in a broad perspective, item response theory can no doubt
contribute substantially to the validity of the conclusions we are able to draw on
the basis of our research efforts. There is good reason to believe that the largely
inconclusive methodological experiments of the 1960's and 1970's would have
produced less equivocal results had the measurements of treatment effects been
performed with the rigorous control which we are now, some 25 years later, in a
position to exercise.
Further remarks on the significance of item response theory will be given in
the conclusion of the paper.
2 Assessment of Language Proficiency by Means of Reduced Redundancy
Techniques: Illustration of a Recent Line of Development
The latter part of this paper is devoted to a presentation of a set of techniques
in language testing which was not available, or at least not widely employed,
at the time of the Pennsylvania project (Smith 1970) and other
well-known comparative method studies of the sixties and the early seventies.
My intention in doing this is to illustrate a line of development in language testing
which has attracted intense interest among theoreticians and practitioners
alike and which, furthermore, has resulted in concrete test models applicable in
language teaching research.
2.1 The Cloze Procedure
The use of the cloze procedure in language testing has grown at a phenomenal
rate in the last 10 to 15 years. Many major proficiency test batteries in use
today include a cloze part of some sort or other and the technique is quite
commonly used in the ordinary foreign language classroom. Its widespread
popularity derives from the fact that it is relatively easy, even for a layman, to
convert a piece of text into a passable cloze test, and also from the fact that the
scoring is usually fairly simple and straightforward (most notably if one uses the
exact-word principle). The technique is, furthermore, extremely well researched,
and studies testifying to its usefulness abound (for surveys on the cloze and its
various modified forms, see for instance Oller 1979; Cohen 1980: 89-110).
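To illustrate how little is involved in converting a text into a cloze test and scoring it, the steps might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure of any study cited here: the function names (`make_cloze`, `score_exact`, `score_acceptable`), the deletion rate `n`, the intact `lead_in` and the handling of punctuation are all hypothetical choices made for the example.

```python
import re

# Assumption: a "word" is a run of letters/digits that may contain internal
# apostrophes or hyphens; surrounding punctuation stays in the text.
TOKEN = re.compile(r"(\W*)([\w'-]*\w)(\W*)")

def make_cloze(text, n=7, lead_in=10, blank="____"):
    """Fixed-ratio deletion: blank every n-th word after an intact lead-in."""
    out, answers = [], []
    counted = 0
    for token in text.split():
        m = TOKEN.fullmatch(token)
        if m is None:              # pure punctuation: keep, do not count
            out.append(token)
            continue
        counted += 1
        position = counted - lead_in
        if position > 0 and position % n == 0:
            prefix, core, suffix = m.groups()
            answers.append(core)   # store the deleted word as the key
            out.append(prefix + blank + suffix)
        else:
            out.append(token)
    return " ".join(out), answers

def score_exact(responses, answers):
    """Exact-word principle: only the originally deleted word counts."""
    return sum(r.strip().lower() == a.lower()
               for r, a in zip(responses, answers))

def score_acceptable(responses, acceptable):
    """Acceptable-word principle: any word in a pre-agreed set counts.

    `acceptable` holds one set of lower-case alternatives per blank
    (hypothetical judgements supplied by the test constructor).
    """
    return sum(r.strip().lower() in alts
               for r, alts in zip(responses, acceptable))

cloze_text, key = make_cloze(
    "one two three four five six seven eight nine ten", n=3, lead_in=1)
print(cloze_text)
print(key)
```

The exact-word scorer needs nothing beyond the answer key, which is why it is the simpler of the two principles; the acceptable-word scorer depends on a list of alternatives that someone has to judge in advance.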
It is not surprising, therefore, that strong claims have been made for the
value of the cloze procedure. It is sometimes contended, for instance, that a
well-designed cloze measures not only language skills at a relatively low level
(e.g. command of vocabulary, grammar, idioms), but also higher-order skills
such as awareness of "intersentential relationships", global reading comprehension
etc. (see for instance Chihara et al. 1977; Bachman 1982; Bensoussan and
Ramraz 1984). Briere and Hinofotis (1979: 12) state that "Regardless of scoring
method, frequency of items deleted, or length of passage, results on a cloze test
correlate highly (usually .70 or better) with overall placement batteries in ESL".
Oller (1979: 357) tells us that "Ever since Taylor's first studies in 1953, it has
been known that cloze scores were good indices of reading comprehension". It
may be added that the cloze was originally devised as a method for assessing the
readability of texts (Taylor 1953).
However, data that may cast doubt on the cloze as a valid assessment instrument
are not lacking. Some researchers (e.g. Carroll 1972; Lado 1986) have
questioned the notion that successful performance on cloze tests requires ability
to interpret global text meanings, the implication being that cloze items are
essentially sentence-bound. Other researchers have tried to define the possible
limit of the range of a cloze task to 5-10 words on either side of the blank. If
such an estimate were to be found valid, it would mean, in effect, that cloze tasks
are often insensitive to discourse constraints across sentence boundaries.
Markham (1987: 309), investigating cloze sensitivity to global comprehension,
concludes that the cloze procedure does not really assess comprehension at the
macro level: "It does not appear necessary to pay attention to the global cues in
order to complete the deletions". Other studies (for instance Hanzeli 1979) have
pointed to a special problem affecting the cloze, i.e. the difficulty of measuring
control of content words. Certain word classes, notably adjectives and certain
adverbs, are very hard to elicit by means of the deletion technique. Function
words are easier, because they are, as the jargon goes, "subject to local
determinacy", i.e. their immediate environment provides the necessary clues for their
substitution.
2.2 The Rational Deletion Cloze—An Investigation
In two studies of my own, I compared native speakers' performance on rational
deletion cloze tasks (i.e. cloze tasks involving deletion of "suitable" words
rather than deletion of every nth word) with the performance demonstrated by
non-native students of English in the Swedish Upper Secondary school (Oscarson
1986). The object of the comparisons was to obtain construct validation data
on two national standardized tests in English, of which the cloze tasks formed a
part. The native samples consisted of a total of 271 A-level students at four
different schools in Britain. They were all 16-17 years old and represented, by and
large, the same educational and intellectual stratum as the Swedish sample. The
item responses were scored for contextual appropriateness, i.e. according to the
acceptable-word principle.
Several interesting results emerged from the experiments. First of all it was
found that the native groups were not able to obtain a perfect score on the cloze
part. They averaged 85 per cent of the maximum number of points available.
However, a select group of particularly bright students reached an average score
of 96 per cent correct, which seems to indicate that the test is operative also at a
very high level of performance. Furthermore, the cloze part proved to be a
reliable indicator of overall achievement. With both test batteries (each consisting
of separate sections measuring mastery of vocabulary, reading comprehension,
listening comprehension, and grammar, with reading comprehension covering
passage comprehension as well as comprehension of "mini-texts") and for both
samples (native and non-native), the cloze score reflected average performance in
that it was on exactly the same level as the aggregate score. (This was also true of
the select native sub-group mentioned above.) In other words, the cloze type here
investigated seems to sample a very wide range of ability in the language tested. It
may be added that this piece of evidence is entirely in agreement with our experience
from the test when used for its ordinary purpose, i.e. as part of the national
assessments in Sweden. Reliabilities (according to the KR20 formula) are invariably
close to .90 and concurrent validity indices, based on teachers' grades, .65
or thereabouts.
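For readers unfamiliar with the KR20 coefficient (Kuder-Richardson formula 20) referred to above, a minimal sketch of the computation is given here. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the population variance of the total scores, a common textbook convention; the function name `kr20` and the small data set are hypothetical and bear no relation to the Swedish test data.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1 item scores.

    responses: one list per test-taker, one 0/1 entry per item.
    KR20 = k/(k-1) * (1 - sum(p_i * q_i) / variance of total scores),
    where p_i is the proportion answering item i correctly, q_i = 1 - p_i.
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    if n_persons < 2 or n_items < 2:
        raise ValueError("need at least two test-takers and two items")

    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n_persons
    # Assumption: population variance (divide by N), not the sample variance.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_persons
    if var_total == 0:
        raise ValueError("total scores do not vary; KR20 is undefined")

    sum_pq = 0.0
    for i in range(n_items):
        p = sum(person[i] for person in responses) / n_persons
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)

# Hypothetical scores: four test-takers, three items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(scores))  # 0.75
```

With real test data the matrix would have one row per student and one column per blank or item; values near .90, as reported above, indicate high internal consistency.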
In sum, there is hardly any doubt that the cloze is a very useful technique for
measuring foreign language proficiency, although it has its limitations. It has
proved to be a good measure of "low-level" or fundamental linguistic ability
relating to vocabulary mastery and syntactical awareness, and the weight of
evidence is that it also measures the test-taker's global proficiency in the language
quite well. Therefore, it is applicable in a wide variety of contexts, including
language learning research.
2.3 The C-Test
The classical cloze of the fixed-ratio deletion type has spawned the development
of a large number of cloze-like variants, e.g. the rational deletion cloze
(see above), the partial dictation test, which involves deletion of portions of
recorded speech, and the cloze-elide test, which involves identification of
irrelevant words inserted in a text (Manning 1987). Probably the most intriguing and
innovative of recent additions to the family of cloze techniques is the so-called
C-test, which was introduced in 1982 (for a comprehensive presentation and
survey of research, see Klein-Braley and Raatz 1984; for a review of the technique,
see Carroll 1987).
A C-test is constructed by deleting the second half of every second word in
a number of short texts (usually five or six). Each text is regarded as a
"super-item" and item statistics are not calculated on the basis of performance on
individual tasks (blanks) but on the "super-item" level.
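The construction rule just stated can be sketched in a few lines of Python. The sketch is illustrative only and rests on assumptions not given in the text: for words with an odd number of letters the larger half is deleted, one-letter words are left intact and not counted, and deletion starts with the second countable word of the text. The names `mutilate`, `make_c_test_text` and `super_item_score` are hypothetical.

```python
import re

# Assumption: a "word" may contain internal apostrophes or hyphens;
# surrounding punctuation is preserved.
TOKEN = re.compile(r"(\W*)([\w'-]*\w)(\W*)")

def mutilate(word):
    """Keep the first half of the word and blank out the rest."""
    keep = len(word) // 2          # odd length: the larger half is deleted
    return word[:keep] + "_" * (len(word) - keep)

def make_c_test_text(text):
    """Delete the second half of every second countable word in one text."""
    out, answers = [], []
    counted = 0
    for token in text.split():
        m = TOKEN.fullmatch(token)
        if m is None or len(m.group(2)) < 2:   # punctuation, one-letter words
            out.append(token)
            continue
        counted += 1
        if counted % 2 == 0:
            prefix, core, suffix = m.groups()
            answers.append(core)
            out.append(prefix + mutilate(core) + suffix)
        else:
            out.append(token)
    return " ".join(out), answers

def super_item_score(responses, answers):
    """One score per text: the number of correctly restored words."""
    return sum(r.strip().lower() == a.lower()
               for r, a in zip(responses, answers))

item_text, key = make_c_test_text("The weather was cold, and I stayed home.")
print(item_text)   # The wea____ was co__, and I sta___ home.
print(key)         # ['weather', 'cold', 'stayed']
```

A full C-test would apply `make_c_test_text` to each of the five or six texts and treat the resulting `super_item_score` values, one per text, as the units for item statistics.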
The development of the C-test arose out of the above authors' critical analysis
of the assumptions underlying the Cloze, e.g. as regards the extent to which
a set of cloze tasks may be viewed as representing a random sample of the
elements of the language and also as regards the general validity of the procedure
(for a comprehensive account of the theoretical justification for the C-test, see
Klein-Braley 1985). Both the classical Cloze and the C-test may be described as
pragmatic and authentic tests in the sense that they use authentic materials as
the basis for item construction, but the originators of the latter test claim that a
better operationalization of the principle of random selection of language
elements is achieved with the C-test model. They hold that it tends to sample from
the various language elements more evenly than the Cloze procedure does, and
also that a better representation of "the real language" is achieved owing to the
fact that the test format involves the use of a variety of different text types.
Impressive empirical test data have been reported (see particularly
Klein-Braley and Raatz 1984). Cohen et al. (1984), investigating the possibilities of
adapting the C-Test technique to testing in Hebrew, found that the technique
"appears to be both a reliable and valid measure of general language
proficiency" (p. 225). The evidence is, therefore, that this relatively new, and as yet
not widely employed, variant of deletion test has a great deal to offer in the way
of effective and reliable scoring.
However, the C-Test seems to suffer from at least one disadvantage, namely that
of questionable face validity (a problem which also affects the Cloze, although to
a minor degree). Mutilation of every second word in a text, albeit undertaken
for the sake of as wide coverage of linguistic elements as possible, no doubt
results in a product which does not really convey an impression of authentic
language, and consequently the researchers' ambition to secure representativity may
in fact prove to be somewhat self-defeating. The question of whether
the C-Test format will be well received in the field, i.e. among teachers and
learners, seems to be crucial. In-depth studies of attitudes as well as of test-taking
strategies would seem to be called for (cf. work undertaken by Grotjahn 1986).
Finally, it might be added that further examination of test dimensionality,
for instance by means of latent trait methods (discussed in Section I), will be of
vital importance (some work has already been done, cf. Raatz 1985) in order to
ascertain whether variables other than linguistic ones are at play in C-Test
performance. Is it possibly the case that some extralinguistic ability (or some very
particular linguistic ability) is helpful in restoring words cut in half, or are the
demands of the task of such a nature that all-round linguistic competence is a
necessary and sufficient prerequisite for successful performance? The test is of
an integrative type and is designed to measure general language proficiency, as
will have become clear from the above account.
3 Concluding Remarks
The aim of my paper has been to exemplify trends in contemporary language
testing theory and practice which may have improved the conditions for
research into foreign language teaching and learning. As for testing theory, I
have chosen to discuss a major addition to our arsenal of statistical tools for
extracting information from test data, i.e. latent trait (or item response) theory. As
for practical test design, I have chosen to illustrate the quite heavy impact that
the principle of reduced redundancy testing has had on language tests in the last
couple of decades.
Needless to say, there are many other areas of great topical interest which
might have been considered in this context. I am thinking, in particular, of the
very strong move towards communicative language teaching and learning and
the way in which this is reflected in current testing procedures. This significant
change of emphasis, i.e. away from a predominantly structural approach, poses
particular problems when it comes to the evaluation of attainment and it might
therefore be appropriate to end this paper by trying to relate the described
developments in statistical item analysis and testing to the challenge of
communicative language testing.
The issue is: Can competencies postulated by existing models of communicative
ability (e.g. that of Canale and Swain 1980) be appropriately dealt with
within the framework of current language testing theory and practice? Thus
stated, the question seems simple enough. However, the real complexity of the
problem becomes apparent if we remind ourselves of what kinds of components
modern models of language competence usually employ in their descriptions.
Typically they include variables such as grammatical competence (including
not just control of structures and rules, but also control of the phonetic system,
of semantics, of lexicon etc.), sociolinguistic competence (including choice of
register, style, conventions etc.), strategic competence (including verbal as well
as non-verbal communication strategies), etc. Assessing the full range of a
person's abilities in all such domains is of course very difficult, if not impossible, in
an ordinary testing situation and there still remains a great deal of uncertainty as
to what and how to test. Nevertheless, one would probably be justified in saying
that language testers are actually beginning to come to grips with many of the
problems the task involves. The language tester's repertoire is in fact quite
impressive and here one might point to such developments as are illustrated above,
in spite of the fact that reduced redundancy testing techniques do not possess
real face validity from a communicative point of view. They do contribute,
however, to providing a fuller picture of an individual's ability.
With regard to the question of what precisely one may assess by means of
any given test, for instance whether a single dimension is involved in the
measurements, we can place considerable trust in item response theory as explained
earlier. IRT provides a promising basis for making inroads into a better
understanding of what language tests measure. Whether IRT and the notion of latent
traits can be firmly established in a wider theory of language testing is a question
which may yet have to await its final answer. Communicative ability is an elusive
concept which does not easily lend itself to penetrating inquiry and detailed
quantification, even by sophisticated methods, and the work now being
undertaken is far from completion. Having said that, I would still like to reiterate
the main argument of my paper, namely that we have made headway in a great
many areas of foreign language testing and that we should now be in a position
to attack the perennial question of what effective language teaching looks like
with renewed confidence.
Acknowledgement
My thanks are due to Professor Jan-Eric Gustafsson, Gothenburg University,
and Dr. John de Jong, CITO, Arnhem, who read an earlier version of this
paper and suggested several alterations to the text. Any remaining inaccuracies
are entirely my own.
References
Adams, R.J., P.E. Griffin and L. Martin. 1987. "A latent trait method for measuring a dimension
in second language proficiency." Language Testing 4/1.9-27.
Alderson, J. Charles. 1979. "The cloze procedure and proficiency in English as a foreign
language." TESOL Quarterly 13/2.219-227.
Bachman, Lyle. 1982. "The trait structure of cloze test scores." TESOL Quarterly 16.612-670.
Bensoussan, M. and R. Ramraz. 1984. "Testing EFL reading comprehension using a
multiple-choice rational deletion cloze." Modern Language Journal 68/3.230-239.
Briere, E.J. and F.B. Hinofotis. 1979. Concepts in Language Testing: Some Recent Studies. TESOL,
Georgetown University, Washington DC. 20057.
Canale, M. and M. Swain. 1980. "Theoretical bases of communicative approaches to second
language teaching and testing." Applied Linguistics 1/1.1-47.
Carroll, John B. 1972. "Defining Language Comprehension: Some Speculations." Language
Comprehension and the Acquisition of Knowledge ed. by R.B. Freedle and J.B. Carroll, 1-29.
Washington DC: Winston.
Carroll, John B. 1987. "Review of Klein-Braley, C. and Raatz, E. 1985. 'C-Tests in der Praxis.' in
Fremdsprachen und Hochschule, AKS-Rundbrief 13/14, Bochum: Arbeitskreis
Sprachenzentrum /AKS/" Language Testing 4/1.99-106.
Chihara, T., J. Oller, K. Weaver and M.A. Chavez-Oller. 1977. "Are cloze items sensitive to
constraints across sentences?" Language Learning 27/1.63-69.
Cohen, Andrew D. 1980. Testing Language Ability in the Classroom. Rowley, MA: Newbury
House.
Cohen, A.D., M. Segal and R. Bar-Siman-Tov. 1984. "The C-Test in Hebrew." Language Testing
1/2.221-225.
De Jong, H.A.L. and C.A.W. Glas. 1987. "Validation of listening comprehension tests using item
response theory." Language Testing 4/2.170-194.
Grotjahn, R. 1986. "Test validation and cognitive psychology: Some methodological
considerations." Language Testing 3/2.159-185.
Gustafsson, J.E. 1977. The Rasch model for dichotomous items: Theory, applications and a
computer program. ( = Department of Education and Educational Research, University of
Göteborg, Sweden, Report No. 63.)
Gustafsson, J.E. 1980. "Testing and obtaining fit of data to the Rasch model." British Journal of
Mathematical and Statistical Psychology 33.205-233.
Gustafsson, J.E. 1981. An introduction to Rasch's measurement model. Göteborg, Sweden:
Department of Education and Educational Research, University of Göteborg.
Hambleton, R.K. and H. Swaminathan. 1985. Item Response Theory: Principles and Applications.
Boston: Kluwer-Nijhoff Publishing.
Hanzeli, Victor E. 1979. "Cloze Tests in French as a Foreign Language: Error analysis." Concepts
in Language Testing: Some Recent Studies ed. by E.J. Briere and F.B. Hinofotis, 3-11.
Washington DC: Teachers of English to Speakers of Other Languages.
Henning, G. 1984. "Advantages of latent trait measurement in language testing." Language Testing
1/2.123-133.
Henning, G. 1987. A Guide to Language Testing: Development, Evaluation, Research. New York:
Newbury House.
Henning, G., T. Hudson and J. Turner. 1985. "Item response theory and the assumption of
unidimensionality for language tests." Language Testing 2/2.141-154.
Hughes, A. and D. Porter, eds. 1983. Current Developments in Language Testing. London:
Academic Press.
Klein-Braley, Christine. 1985. "A cloze-up on the C-Test: A study in the construct validation of
authentic tests." Language Testing 2/1.76-104.
Lado, Robert. 1986. "Analysis of native speaker performance on a cloze test." Language Testing
3/2.130-146.
Manning, W.H. 1987. Development of cloze-elide tests of English as a second language ( = TESOL
Research Report, 23.) Princeton, NJ: Educational Testing Service.
Markham, Paul L. 1987. "Rational deletion Cloze processing strategies: ESL and native English."
System 15/3.303-311.
Munby, John. 1979. Communicative Syllabus Design. Cambridge: Cambridge University Press.
Oller, John W. Jr., ed. 1983. Issues in Language Testing Research. Rowley, MA: Newbury House.
Oller, John W. Jr. 1979. Language Tests at School: A pragmatic approach. London: Longman.
Oscarson, Mats. 1986. Native and Non-Native Performance on a National Test in English for
Swedish Students: A Validation Study ( = Report No. 1986:03, Department of Education and
Educational Research.), Göteborg, Sweden: University of Göteborg.
Palmer, A.S. and L.F. Bachman. 1981. "Basic concerns in test validation." Issues in Language
Testing ed. by J.C. Alderson and A. Hughes, 135-151. London: The British Council.
Perkins, K. and L.D. Miller. 1984. "Comparative analysis of English as a second language reading
comprehension data: Classical theory and latent trait measurement." Language Testing
1/1.21-32.
Pollitt, A. and C. Hutchinson. 1987. "Calibrating graded assessments: Rasch partial credit analysis
of performance in writing." Language Testing 4/1.72-92.
Porter, Don. 1983. "The effect of quantity of context on the ability to make linguistic predictions."
Current Developments in Language Testing ed. by A. Hughes and D. Porter, 63-74. London:
Academic Press.
Raatz, U. 1985. "Better theory for better tests?" Language Testing 2/1.60-75.
Rasch, G. 1960. Probabilistic models for some intelligence and attainment tests. Chicago: The
University of Chicago Press.
Smith, P.D. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign
Language Instruction: The Pennsylvania Foreign Language Project. Philadelphia: The Center for
Curriculum Development.
Taylor, W.L. 1953. "Cloze procedure: A new tool for measuring readability." Journalism Quarterly
30.415-433.
Willmott, A.S. and D.E. Fowles. 1974. The Objective Interpretation of Test Performance: The Rasch
Model Applied. Slough, Bucks.: NFER Publishing Company.
Woods, A. and R. Baker. 1985. "Item response theory." Language Testing 2/2.118-140.
Wright, B.D. 1977. "Solving measurement problems with the Rasch model." Journal of
Educational Measurement 14/2.97-116.
Wright, B.D. and M.H. Stone. 1979. Best Test Design: Rasch Measurement. Chicago: MESA Press.
Section III—Teaching Environments
Introduction to the Section on Teaching Environments
Kees de Bot
One of the questions most frequently put to researchers in applied linguistics
is: "What is the best way to teach a foreign language?" This question is both
valid and embarrassing. It is valid because there is a formidable amount of
research on various aspects of language learning and teaching, which would
suggest, and not just for the layman, that such a basic question has been answered
long ago. There is an almost booming market of new approaches and methods
that share a promise of quick success for language learners. It could be inferred
that all these new approaches have evolved from basic research on successful
ways of teaching. It is not easy to explain why this is not true. The question is
embarrassing because it cannot be answered at this moment, and will not be
answerable in the foreseeable future.
The papers in this section aim at elucidating why it is so hard to solve this
problem. The complexity of the question as it is stated above becomes obvious if
one tries to reformulate the question into testable hypotheses.
A first problem is that there is considerable confusion with respect to the
terms and definitions used, and how the terms used relate to each other. Brumfit
shows in his paper that there are (at least) three different levels: "approaches,"
"methods" and "techniques." The highest, most general educational level is the
"approach," an axiomatic construct which defines the "methods," the sets of
ideas that are implemented in teaching "techniques." All papers in this section,
and several in other sections, show that these three terms tend to be used
unsystematically in discussions on language teaching. This is very obvious in recent
discussions on bilingual education in a number of countries: in the hot debates
on this issue it is very often unclear whether arguments pro and con refer to
bilingual education as an approach or as a technique.
A second problem is that the relation between effectiveness, defined as the
acquisition of a given amount of foreign language knowledge/skills within a
certain amount of time, and the characteristics of the approaches, methods and
techniques tends to be oversimplified. In fact, there are two levels to be taken
into account: a teacher level and a learner level:
teacher level: "methods" → "techniques"
learner level: "classroom behaviour" → "learning"
In the early literature on comparison of methods discussed in the papers in
this section, "methods" and "learning" are supposed to be causally related.
Learning would take place by virtue of the methods used. More recent literature
has shown that the relation between "methods" and "techniques" is already a
fairly complex one: different methods appear to be using the same techniques,
while the number of different techniques used and defended by proponents of
the same method is quite large.
In addition, the methods and techniques are linked by teachers' activities. It
is a well known finding nowadays that methods are not simply implemented by
teachers. Teachers have their own interpretation of a method, which may have
little in common with the original ideas behind a method. The behaviour of
teachers in their classrooms is guided by many factors, and the method used is not
by definition the most important one. The inclusion of the variation caused by
this weak relation between methods and techniques is certainly an improvement
in this kind of research.
Another improvement in recent studies over the older ones is that the link
between "techniques" and "classroom behaviour" has been recognised as a
crucial one in the teaching process. A number of systems for observation in foreign
language classrooms have been developed, and some of these, like TALOS,
monitor not just the teacher's behaviour, but also the activities of the learners.
It is becoming almost painfully clear that learners "waste" a lot of time. They are
engaged in many activities during schooltime, sometimes even the kinds of
activities prescribed by the "techniques" used by the teacher. Yet research using
time-sampling techniques, in which very detailed quantitative descriptions can
be given of the learners' activities, makes it clear that the amount of time
effectively used for learning is quite small. At the same time, we do not know
whether it is at all possible or even effective to have learners focussed on their
learning activity all the time. Maybe gazing out of the window is a perfect way to
digest new information.
In his contribution, Allwright proposes to focus research on this particular
part of the process: the way in which learners define their own, probably
idiosyncratic learning environment.
The relation between "methods" and "learning" is further weakened by the
link between "classroom activities" and "learning." There is, as yet, simply no
way to get to know to what extent certain activities lead to the changes in
cerebral activity we tend to call learning. Recent work on ERPs (event-related
potentials: low-voltage but detectable cerebral activity that appears to be
related to certain types of activities and stimuli) definitely has some potential, but
this type of research is still in its infancy, and the way in which data are gathered
at present does not lend itself particularly to classroom research: a class full of
adolescents sitting motionlessly with their heads covered with electrodes could
be the ultimate dream of a tired teacher, but it is certainly not an ecologically
valid research environment.
One of the aspects that have not been explored in any detail is what the long
term effects of different methods are. Longitudinal research is still to be done. It
is conceivable that certain methods are relatively successful in having immediate
effects, while other methods may be less successful in the short run but lead
to better retention over the years. As pointed out by Van Els et al. (this
volume) the effectiveness of a method has to be related to the goals set for foreign
language teaching. In some cases short term success is sufficient, but in general
the aims are more far reaching in time.
In her paper Mitchell stresses the importance of both the goals of the teaching
method and the goals of the evaluation of that method. Evaluation does not
take place in a political vacuum, but rather, the research will be interpreted by
those involved: the politicians, the teachers, and sometimes the learners or their
parents. This implies that a certain transparency and face-validity of the
methodology is called for. Outcomes that are "mere numerals" are unlikely to change
politicians' decision making or parents' attitudes.
In research on foreign language teaching, the same type of methodological
discussions take place as in other educational research. One of the bones of
contention is the validity of (quasi-)experimental designs. Of the authors in
this section, Larsen-Freeman is clearly more in favour of this type of research
than the others. She stresses the importance of process-oriented
data-gathering, but does not reject the application of experimental designs as such.
Mitchell advocates evaluation studies with a non-experimental but many-faceted
design. Her main argument is that by their very nature, experimental design
studies will miss too much relevant information.
The papers in this section are clearly related to the other papers in this
book through their emphasis on the context in which teaching and learning take
place. In a sense teaching environments are just one type of learning
environment, if we take the perspective of the learner. Allwright raises the point that
given the learners' idiosyncratic behaviour, it may even be very difficult to show
that a learning environment which includes a teacher is that different from
environments in which there is no teaching at all. But there is also the perspective of
the teacher. As Larsen-Freeman says in her paper, the teachers are the agents in
the learning/teaching process, and we need to know what teachers do and why
they do it.
Research on Language Teaching Methodologies: A Review of the Past and an Agenda
for the Future
Diane Larsen-Freeman
The Audio-Lingual Method (ALM) dominated the language teaching scene
in America during the middle of the century. Owing its birth to America's need
for foreign language speakers during World War II, the ALM went unrivaled for
almost three decades. Since its loss of favor in the early 1970s, no single
methodology has been as widely practiced. This is not to say that there are no
contenders. At least six "innovative" methodologies have been promulgated in
America during the past twenty years. Detailed descriptions of each of these
have been provided elsewhere (Larsen-Freeman 1986) and will not be repeated
here. What should be noted, however, is that while for better or worse, the ALM
was grounded in the prevailing linguistic and psychological theories of its time,
no innovative methodology can claim the same. Rather it is the case that for
almost every theoretical principle upon which some current methodological
practice is based, there exists a contrary principle underlying some other current
methodological practice. To cite but a few of the contradictions:
Role of the Teacher

The role of the teacher in Suggestopedia is that of an authority figure in
whom the students feel sufficient trust to enter a state of infantilization. In
Community Language Learning, the teacher is a counselor, or facilitator of students'
learning; in the Silent Way, the teacher is silent as much as possible so that it is
the students who are manipulating the language and thus taking responsibility
for their learning.
Language Focus
In the Natural Approach, what is important is that the language the teacher
uses is comprehensible. In other words, structural diversity is permitted as long
as what is being transmitted by the teacher is understood. This is not the case
with the Silent Way. With the Silent Way, there is both structural grading3 and a
restricted functional vocabulary, at least at the beginning levels. No such
linguistic constraints are placed on what is taught to the students in a Community
Language Learning class. In fact, it is the students who determine the syllabus,
indicating what it is they wish to learn of the target language (TL) by having
conversations in their native tongue which are subsequently translated into the TL.
Linguistic structures receive little attention from students in a course where the
Communicative Approach is being practiced. Instead, students are engaged in
using the language, and thus practicing the functions to which the language is
put.
Use of the Students' Native Language

Whether or not to use the students' native language (NL) in the classroom
has been a controversial issue among language educators for centuries. There
still exists no consensus. Community Language Learning teachers draw freely
upon the students' NL in the classroom for the purpose of making meaning
clear. While Silent Way teachers do not necessarily use the students' NL in the
classroom, much of their teaching is cast from a contrastive perspective, building
upon what their students already know, i.e. their NL. Total Physical Response
(TPR) teachers avoid use of the students' NL in the classroom, making meaning
clear through physical actions and gestures. In fact, Asher, the originator of the
TPR methodology, explicitly eschews use of the NL and criticizes Community
Language Learning for invoking it "which slows learning for beginners" (Asher
1984).
Citing these discrepancies among modern language teaching methodologies
should not obscure the fact that there are also commonalities among them (see
e.g. Larsen-Freeman 1987). Rather, I have chosen to highlight the divergent
thinking which exists in the field to dramatize the need for empirical research to
increase our understanding of the teaching/learning process. This call for
research should not be misconstrued. I am not suggesting that the desirable
outcome of research would be the coalescing of the divergent thinking and the
subsequent adoption of a single panacean methodology. For one thing, I
seriously doubt that there is a single methodology optimal for all teachers, students and
situations. Even if there were, it is unlikely that it would be fail-proof, as we
know that methodological prescriptions are subject to widely different
interpretations and applications by practitioners (cf. 1.1 below). It is also doubtful that
empirical research would yield unequivocal results indicating the superiority of
one methodology over another. Certainly this has not been the case to date (cf.
1.1 below). I do believe, however, that there are some instructional practices
which are superior for certain purposes and for certain teachers and students
and where there is divergence from these in practice, we should be able to
explain the differences in terms of learning outcomes. I will elaborate on this point
in 2.1 below. First, though, I will summarize what research has been conducted
on teaching methodologies and related matters. Following that review, I will
propose two categories of investigation which I believe should be included on a
future research agenda.
1 Summary of Empirical Research
1.1 Global Methodological Comparisons
Early empirical research centered on language teaching methodologies
involved large-scale comparative studies. Agard and Dunkel (1948) at the
University of Chicago were among the first to compare the "new type" (i.e. ALM) of
language teaching methodology with that of the more traditional
grammar-translation method. Other studies in this vein, comparing ALM to
grammar-translation or more cognitive methodologies, were the Scherer-Wertheimer
(1964) experiment involving the teaching of German at the University of
Colorado, the Pennsylvania Project involving the teaching of German and French at
the secondary school level throughout the state (Smith 1970), and in Swedish
high schools and adult education classes, the Gothenburg English Teaching
Method (GUME) Project (Levin 1972). The results of each of these studies
proved inconclusive; neither methodology was determined to be superior
overall. The findings were not only disappointing, but also unpopular. Stern (1983),
for instance, reports that the Pennsylvania study was attacked because it did not
demonstrate that the then innovative ALM was superior to its predecessor.
Two primary and somewhat overlapping explanations have been offered for
the findings of non-significance between the two methodologies in comparison
studies. The first is a question of research methodology; the second questions
the construct of language teaching methodology itself. With regard to the
former, Long (1980: 1) has stated the concern most succinctly:
"In addition to many of the other problems—history, mortality, the
Hawthorne effect, and so on—inherent in methodological comparisons of this
kind, studies like these tend to suffer from the investigators' inability to
control what goes on inside the classroom. There is, after all, no classroom
observational component in the data collection for this kind of research..."
Without an observational component, there is no guarantee that what
teachers are doing is consistent with the methodological principles they purport to
put into practice. To cite one example, Swaffar, Arens and Morgan (1982), after
having given explicit guidance to German teachers who were supposed to follow
either "rationalist" or "empiricist" approaches, concluded that the distinctions
between the approaches were possible to draw in the abstract but were not
confirmable in classroom practice.
As already mentioned, the second serious problem with the methodological
comparisons rests with the construct of methodology. A methodology consists of
a constellation of activities, techniques or procedures which are manifestations
of certain principles. Thus, a repetition drill, a technique associated with the
ALM, would be an exemplar of the application of the principle of language
acquisition being a product of habit formation. While activities/techniques/procedures
are often called for by a particular methodologist, exactly how they are to
be carried out, when, with what frequency, etc. is not fixed. Moreover, some
methodologies are long on principles, but short on activities. Thus, the
classroom teacher is given considerable latitude in interpreting and implementing a
given methodology. Spada (1986) found, for instance, in her study of the
Communicative Approach (CA), that instructors were not always implementing the
CA in the same way. In one of the three classes in her study, the students
received more explicit grammar practice on the formal features of the language
than the other two. This type of practice is one that has been deliberately
discouraged by advocates of the CA and would be one that would threaten the
internal validity of any attempt to compare the efficacy of the CA with more
traditional form-focused methodologies.
While the fact that the teachers are left to interpret and adapt a
methodology to suit their styles and students is laudable (see note 5), it does call into
question whether or not it is global methodologies we should be comparing, or more
localized patterns.
1.2 Classroom Process Description
It will be recalled that the first objection to the methodological comparison
studies was that they lacked an observational component which would allow
the observer to judge whether or not certain methodologies were being carried
out as the researcher envisaged. The need for systematic observation of what
was actually transpiring within a classroom led to more narrowly focused
classroom-centered observations and the concomitant development of observation
schedules (see Hatch and Long 1980; Allwright 1981 for discussion) specifically
geared for language teaching such as Moskowitz's (1971) FLint, Fanselow's
(1977) FOCUS, Ullman and Geva's (1982) TALOS and Allen, Frohlich and
Spada's (1984) COLT.
Another response to the objection has been the carrying-out of a number of
very focused analytic studies of instructional procedures. Long (1987: 100-101)
offers a sampling:
"... teacher question types and their effects on student production, turn-taking
systems, language use in lockstep and small-group work, simplification and
elaboration in teacher speech, ethnic styles in classroom discourse, relationships
between practice and achievement, teacher feedback on learner error,
relationships between task types and student production and negotiation work
and between affective factors and classroom participation".
Researchers conducting these studies have deliberately avoided the
wholesale methodological comparisons, opting instead for reliable descriptions of
more narrowly defined classroom exchanges, configurations, styles, participation
patterns, etc. While such studies are well-motivated and provide a valuable
starting point, they are essentially descriptive and provide no empirical justification
for recommending alterations in classroom practice. In fact, as Long (1987) has
pointed out, in most of the research cited, no student achievement data were
gathered; thus, this type of research represented the exact converse of global
methodological comparisons in which product data were compared without
recourse to process.
Another more serious drawback of these studies is their lack of theoretical
motivation. Theoretically motivated studies are important for two reasons. As
Long (1987) observes, findings from research which lacks a theoretical starting
point often leave us with the inability to generalize beyond the particular results
of the study in question. Moreover, there is the danger that unless the study is
theoretically motivated, results from studies which have preceded it will be
ignored. With idiosyncratic, theoretically ungrounded research, little is
accomplished to deepen our understanding of the teaching/learning process.
1.3 Methodological Features
Perhaps out of recognition of the second problem with comparative
methodological studies, that methodologies are too abstract to be globally compared,
there have been several studies which are more restricted in scope and focus on
a particular feature or cluster of features of methodologies.
Asher (1969, 1972), Postovsky (1970) and Gary (1975) were responsible for
studies designed to test the effectiveness of presenting students with new
material only in the auditory mode initially. In general, results from these studies
indicated that when students were enrolled in courses in which oral responses were
initially delayed, not only was their comprehension superior to that of other
students who were expected to produce language from the first day on, but the
listening training also contributed to superior oral fluency.
Wagner and Tilney (1983) put a key feature of superlearning (an adaptation
of Suggestopedia) to an empirical test. They taught twenty-one adult English
speakers a German vocabulary list containing 300 words for a period of five
weeks. Seven adults received special instruction with accompanying Baroque
music, use of such music being an integral part of Suggestopedia. Seven other
adults were instructed in the same manner without music. The third group of
seven adults served as a control group and studied vocabulary through
traditional rote means. During the five-week period, all three groups were tested on
three different occasions. It was found that the control group's scores were
significantly higher than those of the experimental groups.
The third study to be discussed in this section is one portion of a large
five-year research project undertaken by the Modern Language Centre of the
Ontario Institute for Studies in Education (OISE). The particular study referred to
here was designed to examine the relationship between instructional practices
and the development of proficiency in a second language. Using the COLT
(Communicative Orientation of Language Teaching) observation schedule, the
OISE researchers observed eight core French classes in Toronto. The eight
classes in the sample were ranked on an experiential-analytic scale; Type A
(analytic) classrooms made significantly more use than Type E (experiential)
classrooms of the following (T = teacher; S = students):
topic control by teachers
minimal written texts (S)
minimal utterance in spoken interaction (S)
reaction to code rather than message (S)
restricted choice of linguistic item (S)

Type E classrooms made significantly more use than Type A classrooms of
the following features:

topic control by students
extended written text (S)
sustained speech in spoken interaction (S)
reaction to message rather than code (T,S)
topic expansion (S)
use of student-made materials
(Harley, Allen, Cummins and Swain 1987: 67)
Given the difference between Type A and Type E classrooms, the
researchers hypothesized about the extent to which the differences would
contribute to differences in student knowledge and performance in the eight classrooms
under investigation. These hypotheses were then tested by analyzing student
performance on measures of their grammar, discourse, sociolinguistic
competence and listening skills in French.
In actual fact, only two of the eight classrooms were determined to have
experiential orientations according to their overall COLT score and even these
were termed "relatively" experiential as opposed to relatively analytical.
Nonetheless, the researchers report that their most striking finding was the extent to
which the two different types of instruction were indistinguishable. None of the
differences between groups on adjusted post-test scores was significant,
although the difference between the analytical and experiential groups in favor of
the former nearly reached significance on the grammar multiple-choice written
test. When the two most analytical classrooms were compared with the two
experiential classrooms, more significant differences emerged; however, on most
of the sub-tests, the two groups performed similarly. Moreover, when the total
gain in proficiency was calculated for each class over the year, one of the
experiential classes made the highest gain in overall proficiency of the eight
classes and the other made the lowest. Although these results seem
counter-intuitive, they may be explicable either by pointing to research
methodological problems, or by considering the fact that there is more to language
learning success than the actual practices which are implemented. More will be
said about this below.
From the preceding review of the empirical research in the area of teaching
methodologies, it seems clear that in order to promote our understanding of the
teaching/learning process, future research should not attempt to compare
methodologies on a global level, but rather should focus on more local practices.
Another requirement should be that research designs include both process (what is
actually happening in the classroom) and product (what the learning outcomes
are) with an observational component built in to verify that the former is
proceeding as planned. Furthermore, the research should be theoretically
motivated in order to contribute to a coherent, rather than fragmented, view of the
teaching/learning process.
2 Two Areas for a Future Research Agenda
2.1 Process/Product Studies
Process-product studies which focus on classroom practices and learning
outcomes, which are theoretically motivated so that one study can be related to
another and which operate at a sub-global level would be very welcome indeed.
Moreover, if these studies dealt with optimal intervention points, low-inference
high-frequency behaviors, which can be manipulated by the teachers and
students so that the findings can be applied to language teaching and teacher
training, the field would be well-served (Long and Crookes 1986). A number of
topics for such process/product studies have been suggested in another
document (Larsen-Freeman and Long 1988) and thus will not be repeated here.
Instead, a study which is illustrative of the type being called for will be discussed.
Adopting the information-processing perspective of cognitive psychologists,
Hulstijn (1989) conducted two experiments to investigate the differential effects
of instruction when student attention is directed towards form or meaning or
both. Hulstijn reasoned that if students are able to rely solely on a top-down
semantic processing strategy when dealing with new TL input, they will. Further, if
the input is always made comprehensible for the students, a standard practice of
Natural Approach teachers (Krashen and Terrell 1983), a top-down semantic
strategy should suffice. A consequence of student reliance on semantic decoding
alone may be that the formal features of the language will receive too little
attention to be acquired.
In one of the experiments Hulstijn conducted, 80 high school students who
were native speakers of Dutch were engaged in learning sentences containing
Dutch content words (to control for prior knowledge) but also marked by
artificial formal features (morphemes, function words, subclause word order). The
subjects were divided into four groups depending on the orientation of their
instruction: form only, meaning only, both form and meaning and a control group
which was given the pre- and post-tests, but worked on an unrelated task during
the learning time allotted the other groups. The other three groups each were
given a different task depending upon its focus. For example, the form-focused
group worked on an anagram task, while the meaning-focused group registered
their opinion about the issues raised in the sentences. The subjects were given
cued recall tests and a sentence copying test which was administered both before
and after the experimental treatment.
From the results, Hulstijn was able to determine that attention to form was
sufficient for implicit learning of the structural features to take place. However,
he only obtained modest evidence to support the claim that focus on meaning
inhibits the acquisition of the formal features.
Although this study may not be unique in meeting the characteristics it is
desirable for process/product studies to have, it does address all three. It is
targeted at a sub-global level, it considers both process and product and it is
theoretically motivated. Moreover, it deals with clear intervention points (e.g.
attention to form) which can make an instructional difference.
One would not expect to find from such studies that certain teaching
practices are intrinsically "good" or "bad" for all learners. Depending upon the
learning outcomes intended, different practices may be exploited. Moreover, for
a particular developmental point, certain practices may be more efficient than
others. As Politzer's (1970) study indicates, there is likely to be a curvilinear, not
linear, correlation between student achievement and teaching practices. Certain
practices may be positively correlated with student achievement at some times,
and uncorrelated or even negatively correlated with it at others.
If optimally-timed and optimally-focused instructional practices do make a
difference, as seems so intuitively obvious, then the type of process-product
study called for here should be illuminative. However, despite their obvious
merit, process-product studies should not be the only nominee for a research
agenda. As was alluded to several times already, there is more influencing
success in language learning than the actual practices which are employed. Indeed,
as we have seen with some of the studies mentioned earlier, no matter how
worthy the practices are which process-product research supports, teachers do
not always put them into practice in the manner prescribed. Rather than
despairing at such behavior, it would be worth our while to encourage research
initiatives which examine how the agent in the instructional methodology, the
teacher, influences the teaching/learning process.
2.2 Language Teacher Studies
In retrospect, it was naive to assume that teachers would put methodologies
into practice without modification. For one thing, teachers have many
concurrent and often competing demands with which to contend at any one time.
Prabhu (1987: 103) has put the matter most cogently:
"What a teacher does in the classroom is not solely, or even primarily,
determined by the teaching method he or she intends to follow. There is a complex
of other forces at play, in varied forms and degrees. There is often a desire to
conform to prevalent patterns of teacher behavior, if only for the sense of
security such conformity provides. There is also a sense of loyalty to the past,
both to the pattern of teaching which the teacher experienced when he or she
was a student and to the pattern of his or her own teaching in the past...
There is the teacher's self-image and a need to maintain status in relation to
colleagues or authorities. Above all, there is a relationship to maintain with a
class of learners, involving factors such as interpretations of attitudes and
feelings, anxieties about maintaining status or popularity, and fears about loss of
face".
In addition to the competing demands with which the teacher must cope, a
responsible teacher will alter methodological practices simply to meet the
learners' needs at the time. As frustrated as we might be when the teachers deviate
from what they are supposed to do during our experiments, we would
experience even more frustration if we were students in a class where a teacher
adhered rigidly to a specific methodological practice when we students were
unresponsive, bored or hopelessly lost. Thus, it is fallacious to view teachers as
mere "conveyor belts" (Lim 1988), delivering language through inflexible
practice.
What empirical research has been conducted on the role of the teacher has
been limited almost exclusively to describing the speech teachers use in
addressing learners, questioning them and giving feedback. What this brief list
dramatizes, as Woods (1988) acknowledges, is that we know very little about what
teachers actually do, although there is no dearth of materials telling teachers
what they should do. Another explanation for the teacher's failure to heed the
"shoulds" or to consistently apply methodological principles is that
methodologists (and one could easily include researchers and even language teacher
educators in this group) do not necessarily conceptualize teaching practice in the same
way as teachers do. If we are to generate knowledge that is to have a positive
impact on pedagogical practice, then we must formulate our inquiries in ways that
are more compatible with teachers' perspectives (Bolster 1983).
In an attempt to understand better why teachers do what they do, Woods
(1988) has conducted an ethnographic study of the basis upon which teachers
make their moment-to-moment decisions. Woods videotaped teachers in classes
and afterwards viewed the tape with the teachers, who stopped the tape to
comment upon the decisions they were making at the time. What Woods discovered
is that there is an incredible complexity of factors which teachers consider when
making decisions. A partial list includes: their explicit lesson plan, the classroom
routines they have built up, the amount of time they have invested in lesson
preparation, the discourse preceding a decision point, their estimation of
student attention, how much time is left, what remains to be done, what the teacher
has just said, what kind of students they have, what the curriculum and materials
dictate, etc. (Woods 1988). Furthermore, there appears to be a hierarchy among
these factors, although the hierarchy may not be strictly adhered to in that the
teacher's previous decisions constrain to some extent subsequent decisions to be
made.
In sum, individualization of lesson implementation happens as much by
teachers as it does by students. It would be helpful to know more about this process.
One hypothesis in need of further study is Larsen-Freeman and Celce-Murcia's
(1985) claim that the teaching process is dynamic and that the most effective
decisions will be made by teachers who choose teaching practices which are
matched for both the challenge the particular teaching point offers and where
the students are at the moment. Prabhu (1988) appears to share a similar
perspective. It is his view that language teaching materials should encourage
alteration by teachers so teachers can be responsive to the needs of their students as
they arise. Politzer (1970: 42) concluded his article with the same sentiment:
"the 'good' teacher is the one who can make the right judgement as to what
teaching device is the most valuable at any given moment".
Thus, the study of the decisions which teachers make and why they make them is my
second nominee for a research agenda. Not only would language teaching
practice and teacher education potentially benefit from such research, but eventually
findings might contribute to a theory of language teaching, which the field sorely
needs.
3 Conclusion
I have claimed that there is a great number of incongruities among practices
associated with innovative language teaching methodologies. I have further
reviewed the empirical research that has been carried out on methodologies and
related matters. Taken as a whole, I think it is fair to say that very little
resolution of the incongruities has thus far taken place. A research agenda should,
therefore, include process-product studies which attempt to resolve the
contradictions, not through homogenization of practice, but rather in linking a specific
practice with particular learning outcomes, depending upon the audience.
There should also, however, be room on the agenda for investigating the
role of the agents in the teaching/learning process. We cannot assume that
teachers are mere conduits from methodologists to students. We not only need to
know what teachers do, but also why they do it.
Ultimately, of course, we must be able to weave all the strands together:
teaching, learning, teacher, learner, materials, context. Until that time, however,
there is much groundwork to be laid.
Notes
1. The six include The Silent Way, Suggestopedia, Community Language Learning, Total
Physical Response, The Communicative Approach and the Natural Approach.
2. Here the term theoretical is used in a broad and generic sense following Stern (1983: 26)
who views each language teaching methodology as a different theory of teaching.
3. Structural grading does not mean a foreordained sequence. The Silent Way teacher assumes
the responsibility for moving from one structure to the logical next, depending upon the
needs of a particular group of students with whom the teacher is working.
4. It is worth noting, however, that in a replication study in Sweden, significant differences in
favor of the Explicit Method (Cognitive-code approach) over the Implicit Method (ALM)
were found when only adults were the subjects (Oskarsson 1973).
5. Indeed the last thing we would want to do is to chastise the teacher who was not being
methodologically chaste. See 2.2 for further discussion.
6. Of course, I could make (and have made, Larsen-Freeman 1983) the same case for studying
the other agent in the process, the learner, who some might argue has an equal or even more
important role to play in the process than the teacher (see, for example, Breen and Candlin
1980; Allwright 1981). As this paper is supposed to deal with teaching methodologies,
however, I will leave it to others to make that case.
7. It is interesting to note that Chaudron (1988) devotes two chapters to the agents in the
teaching/learning process. One chapter is entitled "Learner Behavior", the other simply
"Teacher Talk". Chaudron, himself, explains that "In general in L2 research, learners have
been conceived of as much more 'whole' persons than teachers..."
References
Agard, F. and Dunkel, H. 1948. An Investigation of Second-Language Teaching. Boston: Ginn.
Allen, J., M. Frohlich and N. Spada. 1984. "The communicative orientation of language teaching:
an observation scheme." On TESOL '83 ed. by J. Handscombe, R. Orem and B. Taylor, 231-
252. Washington, DC: TESOL.
Allwright, R. 1981. "What do we want teaching materials for?" ELT Journal 36/1.5-18.
Asher, J. 1969. "The total response approach to second language learning." The Modern Language
Journal 53/1.3-7.
Asher, J. 1972. "Children's first language as a model for second language learning." The Modern
Language Journal 56/3.133-139.
Asher, J. 1984. "The total physical response: some guidelines for evaluation." Paper presented at
the 1984 Milwaukee Symposium on Current Approaches to Second Language Acquisition.
Bolster, A. 1983. "Toward a more effective model of research on teaching." Harvard Educational
Review 53/3.294-308.
Breen, M. and C. Candlin. 1980. "The essentials of a communicative curriculum in language
teaching" Applied Linguistics 1/2.89-112.
Chaudron, C. 1988. Second Language Classrooms: Research on Teaching and Learning. Cambridge:
Cambridge University Press.
Fanselow, J. 1977. "Beyond Rashomon — conceptualizing and describing the teaching act."
TESOL Quarterly 11/1.17-39.
Gary, J. O. 1975. "Delayed oral practice in initial stages of second language learning." New
Directions in Second Language Learning, Teaching and Bilingual Education ed. by M. Burt and H.
Dulay, 89-95. Washington, DC: TESOL.
Harley, B., Allen, P., Cummins, J. and M. Swain. 1987. The Development of Bilingual Proficiency,
Final Report, Volume II: Classroom Treatment. Toronto: The Ontario Institute for Studies in
Education.
Hatch, E. and M. Long. 1980. "Discourse analysis, what's that?" Discourse Analysis in Second
Language Research ed. by D. Larsen-Freeman, 1-40. Rowley, MA: Newbury House Publishers.
Hulstijn, J. 1989. "Implicit and incidental second language learning: Experiments in the
processing of natural and partly artificial input." To appear in Interlingual Processes (= Language in
Performance, 1) ed. by H. Dechert and M. Raupach. Tübingen: Gunter Narr Verlag.
Krashen, S. 1981. Second Language Acquisition and Second Language Learning. Oxford:
Pergamon Press.
Krashen, S. and T. Terrell. 1983. The Natural Approach. Oxford: Pergamon Press.
Larsen-Freeman, D., ed. 1980. Discourse Analysis in Second Language Research. Rowley, MA:
Newbury House Publishers.
Larsen-Freeman, D. 1983. "Second language acquisition: getting the whole picture." Second
Language Acquisition Studies ed. by K. Bailey, M. Long and S. Peck, 3-22. Rowley, MA:
Newbury House Publishers.
Larsen-Freeman, D. 1986. Techniques and Principles in Language Teaching. New York: Oxford
University Press.
Larsen-Freeman, D. 1987. "Recent innovations in language teaching methodology." The Annals of
the American Academy of Political and Social Science 490.51-69.
Larsen-Freeman, D. and M. Celce-Murcia. 1985. "Defining the challenge: an additional choice in
language teaching." A paper presented at the 1985 TESOL Convention, New York City.
Larsen-Freeman, D. and M. Long. 1988. "Research priorities in foreign language learning and
teaching." A paper prepared for the National Foreign Language Center, The Johns Hopkins
School for Advanced International Studies, Washington, DC.
Levin, L. 1972. Comparative Studies in Foreign Language Teaching: The GUME Project.
Stockholm: Almqvist and Wiksell.
Lim, C. 1988. "Producing instructional materials in the Singapore setting." A paper presented at
the 1988 RELC Seminar, 11-15 April 1988, Singapore.
Long, M. 1980. "Inside the 'black box': methodological issues in classroom research on language
learning." Language Learning 30/1.1-42.
Long, M. 1987. "The experimental classroom." The Annals of the American Academy of Political
and Social Science 490.97-109.
Long, M. and G. Crookes. 1986. "Intervention points in second language classroom processes." A
paper presented at the 1986 RELC Seminar, 21-25 April 1986, Singapore.
Moskowitz, G. 1971. "Interactional analysis — a new modern language for supervisors." Foreign
Language Annals 5/2.211-221.
Oskarsson, M. 1973. "Assessing the relative effectiveness of two methods of teaching English to
adults." IRAL 11/3.251-262.
Politzer, R. 1970. "Some reflections on 'good' and 'bad' language teaching behaviors." Language
Learning 20/1.31-43.
Postovsky, V. 1970. "The effects of delay at the beginning of second language teaching."
Unpublished doctoral dissertation. Berkeley, CA: University of California.
Prabhu, N. 1987. Second Language Pedagogy. Oxford: Oxford University Press.
Prabhu, N. 1988. "Materials as support: materials as constraint." A paper presented at the 1988
RELC Seminar, 11-15 April 1988, Singapore.
Scherer, G. and M. Wertheimer. 1964. A Psycholinguistic Experiment in Foreign-Language
Teaching. New York: McGraw-Hill.
Smith, P. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign Language
Instruction: The Pennsylvania Foreign Language Project. Philadelphia: Center for Curriculum
Development.
Spada, N. 1986. "The interaction between type of contact and type of instruction: some effects on
the L2 proficiency of adult learners." Studies in Second Language Acquisition 8/2.181-199.
Stern, H. 1983. Fundamental Concepts of Language Teaching. Oxford: Oxford University Press.
Swaffar, L., Arens, K. and M. Morgan. 1982. "Teacher classroom practice: redefining method as
task hierarchy." Modern Language Journal 66/1.24-33.
Ullman, R. and E. Geva. 1982. Classroom observation in the L2 setting: a dimension of program
evaluation. Modern Language Centre, Ontario Institute for Studies in Education (Mimeo).
Wagner, M. and G. Tilney. 1983. "The effect of 'superlearning techniques' on the vocabulary
acquisition and alpha brainwave production of language learners." TESOL Quarterly 17/1.5-19.
Woods, D. 1988. "Teachers' interpretations of language teaching materials." A paper presented at
the 1988 RELC Seminar, 11-15 April 1988, Singapore.
Problems in Defining Instructional Methodologies
Christopher Brumfit
1 "Method" in language teaching
The terms "methods", "teaching method", and "methodology" are striking
by their absence from the general educational literature. None of them is
defined in a recent dictionary of educational terms (Gordon and Lawton 1984),
and they do not appear in the index of substantial surveys of research in
education (e.g. Dunkin and Biddle 1974; Suppes 1978; or Wittrock 1986, where
the index entry "teaching methods" refers solely to a section on techniques
used in higher education).
Where the term "teaching methods" is used, as in Dunkin (1987), it refers to
highly general procedures applicable to most teaching situations, for example
"the Socratic method".
In discussions of foreign language teaching, however, the terms are used
fairly widely. A recent definition of "method" gives:
"(in language teaching) a way of teaching a language which is based on
systematic principles and procedures, i.e. which is an application of views on how a
language is best taught and learned" (Richards, Platt and Weber 1985:176).
In what has come to be the classic definition, Anthony (1963) relates a
broad "approach" ("a set of correlative assumptions dealing with the nature of
language and the nature of language teaching and learning") to a "method"
which is a plan "no part of which contradicts, and all of which is based upon, the
selected approach. An approach is axiomatic, a method is procedural". Below
"method" come "techniques", which are the classroom activities that implement
methods: "when visitors view a class, they see mostly techniques" (Anthony
1963: 63-67).
As we shall see below, this way of conceptualising teaching activity has been
criticised. However, before considering in detail problems of definition, it is
worth considering why there is an apparent divergence between practices in
FLT and other areas of the curriculum. If we are to clarify the purposes of
research in this area, we need to see whether it differs from general education, or
whether general educational research is still relevant to FLT.
There seem to be two ways of thinking about the concept "method". One is
to offer something which claims to be the kind of coherent package referred to
in the definitions above, and to market this, perhaps as part of a commercial
enterprise, associated for example with a particular language school. Thus we read
that "The Berlitz Method is an imitation of the natural process by which a child
learns its mother tongue" (Berlitz 1907: 3, extracted in Hesse 1975: 315). Here,
a teaching procedure is being marketed as a package, and coherence and clarity
of purpose are obvious commercial assets.
Alternatively, our retrospective observations of language teaching may
enable us to see coherence in the history of language teaching, so that historians
characterise the practices of the past, and reformers characterise the failures of
their predecessors, by making generalisations that are recognised as valid by
readers. Howatt (1984: 131) claims that the "grammar-translation method" was
named by its opponents, and the "direct method" clearly emerges from various
traditions, including the reform movement in Europe and the work of Sauveur
and Berlitz in the United States. These uses of the term "method" are closer to
characterisations of historical movements ("the age of reason", "the romantic
period", "the jazz age") than to the coherent packages of the other tradition.
They are also acknowledged to be approximations in the same way as the names
of historical periods are acknowledged to be.
2 Methods as packages
Education has never been free from polemic, and polemic leads to
oversimplification. Nonetheless, it does seem worthwhile to try to disentangle the basic
principles that are at stake from the fights for franchise or ownership of "new"
procedures and their associated polemic, especially as (as Howatt 1984 shows
clearly) there are only a limited number of basic themes to be drawn upon by
teaching methodologists. Viewing the current scene from a British perspective, I
find it curious that (for example) it was possible for Krashen and Terrell to
market "The Natural Approach" (1983) as some kind of coherent package without
constantly examining the extent to which it overlapped with other traditions in
its recommendations. There are, indeed, serious academic problems about the
notion of "packages", but I have addressed this issue elsewhere, in Brumfit
(1985: 86-93), and shall not repeat those arguments here.
Underlying these different views of "methods" there seem to be two
separate traditions, and it is worth disentangling them.
1. Language teaching has a long history central to the institutionalised
educational process. (Indeed, a recent lengthy encyclopedia article on the history of
teaching methods (Connell 1987) devotes most of its space to the place of
language work in the curriculum.) This tradition was diverted, but did not die, with
the decline of the classics, and is to be found in the general curricular
discussions, and the rationales provided in teacher education, throughout Europe.
2. At the same time, there is a stronger tradition of alternative pedagogies in
language teaching than in other major curricular areas. This is partly because
there is a substantially greater amateur demand for language teaching than there
is for mathematics or other areas of the curriculum: all sorts of people, for
practical rather than academic reasons, need languages at some stage in their
careers, and have done throughout history. It is partly also because alternative
pedagogies in other subject areas were more likely to be repressed: theology
and medicine, to name but two areas, have often been intolerant of alternative
approaches to their fields. Consequently, language teaching has been particularly
open to the claims of inspired outsiders which may or may not have generalisable
value. Furthermore, many non-academic experts learned languages with
great flair, so that bizarre methods that "worked" for individuals could always be
supported by individual testimonies (e.g. Rambert 1972: 45, working with a text
translated into French words with English word order: "It was a simple, but
brilliant system. I was interested, and learnt it all by heart in no time").
The consequence of this is that discussion of foreign language teaching uses
the concept of a packaged "method" more readily than other subject areas do,
where teaching is generally conceived of more as the development of an
accumulated professional wisdom, subject to periodic paradigm shifts, than as a
number of discrete "methods" competing for control. For general educators, the
areas we call "method" are part of "the culture of teaching". This has been the
subject of much research, and it may be beneficial to consider some of the
difficulties that such studies have encountered in relation to our experience with
language teaching methods. A recent discussion of research problems in this area
raises the following issues:
"Three methodological problems have special significance for research on the
cultures of teaching. First, the focus on culture implies inferences about
knowledge, values, and norms for action, none of which can be directly
observed. Second, the existence of many teaching cultures raises difficult
questions: Which culture or cultures does a study address? How can differences
among cultures and similarities within cultures be documented? Third,
researchers must neither evaluate a culture by inappropriate external standards,
nor fall into the relativistic trap of asserting that every aspect of that culture is
good. Judgement is unavoidable in research on the cultures of teaching, where
pragmatic questions about directions for change are always in the minds of
researchers and policymakers" (Feiman-Nemser and Floden 1986: 506).
Such concerns are clearly relevant to our interests. Let us simply note, for
the moment, that many discussions of "methods" (e.g. Richards 1984) treat them
as entities in their own right, almost as experimental models, rather than as
cultures socially emerging from human practices. To what extent is such a formal
view justified?
3 Defining current methods
There have been several recent discussions of language teaching methods,
and there appears to be considerable agreement over which methods are
currently interesting to discuss. Richards and Rodgers (1986) and Larsen-Freeman
(1986) both describe Audiolingualism, Communicative Language Teaching,
Total Physical Response, The Silent Way, Community Language Learning and
Suggestopedia. Larsen-Freeman also discusses Grammar-Translation and the
Direct Method (which are the two major elements in Richards and Rodgers'
introductory historical survey), and Richards and Rodgers also discuss the
Situational Approach and the Natural Approach. Older methods textbooks (for
example the first edition of Rivers' standard introduction of 1968) also follow
this pattern in their historical surveys, perhaps with additional discussion of the
Reading Method.
Six of the methods mentioned above have emerged since 1960: two of the
three major predecessors have histories of more than a century. This plethora of
recent methods considered worthy of attention may be attributed to a number of
possible causes. The massive spread of English teaching in the past thirty years
may have brought us all in contact with more and more varied approaches, and
the buoyancy of the market demand has probably led promoters to advertise
their products more aggressively. At the same time, the English language
teaching profession has become more integrated, as associations of teachers such as
TESOL and IATEFL, of academics in applied linguistics such as AILA and
BAAL, and professional validating bodies (ARELS) and information
dissemination bodies (CILT) have arisen in English-speaking countries. All of these were
founded in the 1960's.
In addition, the apparent coherence and confidence of the dominant 1960's
teaching model, audiolingualism, coupled with the widespread and successful
marketing of its procedures and the accompanying expensive hardware, led to a
variety of attempts to produce alternative models, as its deficiencies became
more apparent to practicing teachers. But its very strength — its alleged marriage
of scholarship in psychology and linguistics — caused opponents to look for
equally potent theoretical justifications for the next paradigm. Its successor had
to be of comparable power in order both to counteract audiolingualism's most
confident, and therefore most dangerous, tendencies, and to displace it in
teachers' minds as the dominant paradigm. For the institutional and market reasons
outlined above, the money, and the political power, lay with English, so that
much of the discussion sprang from English sources, even when the methods
being advocated had originally, as with Suggestopedia, been associated with
quite other academic or pedagogical traditions.
This concern with many methods, then, is a relatively recent phenomenon.
The crucial question for research, however, is whether we are suffering more
from terminological confusion than from a wide variety of significantly different
methods. Richards and Rodgers, in spite of their determination to clarify
Anthony's original discussion, constantly jump across levels of generality ("This
approach to foreign language teaching became known as the
Grammar-Translation Method" 1986: 3), and their first chapter's discussion of
Grammar-Translation is a good illustration of a general principle: that stark
presentation of a retrospectively-perceived phenomenon results in caricature.
Clearly, any method needs to be seen in its historical context if it is to be
understood fully, and there is a great danger in simply caricaturing methods of the past,
and setting them against sympathetically presented contemporary proposals for
current language teaching.
Richards and Rodgers criticise Anthony for not showing how Approaches
may relate to Methods, nor Methods to Techniques. They summarise their
amendments (1986: 28) with "Approach" defined much as Anthony does;
"Design" requires specific and general objectives, a syllabus model, types of learning
and teaching activities, and specified roles for learners, teachers, and
instructional materials; "Procedure" is concerned with classroom techniques, practices
and behaviours based on observation. However, there is some confusion over
the extent to which there is any necessary relationship between the elements in
this model. As Anthony (1963) suggests, any single approach may lie behind a
number of possible methods, and it is also likely that techniques may be in
principle separable from any method: particular methods are more likely to be
identified with constellations of techniques rather than with particular ones
exclusively.
As an illustration of this, consider the characteristics of Communicative
Language Teaching summarised from a range of sources in a contemporary
survey:
1. A focus on the needs of learners, and attempts to define their needs;
2. An emphasis on the content of the activity, rather than on overt language
learning;
3. A tendency to specify syllabuses in terms of meaning ("notional" or
"semantic" syllabuses) or speech acts ("functional syllabuses");
4. Encouragement and tolerance of language variation in the classroom, even
to the extent of mixing mother tongue and target language use;
5. Individualised work;
6. Errors tolerated as a natural part of the process of language acquisition;
7. A supportive environment, to encourage guilt-free participation; a
reduction or suspension of the teacher's judgemental role;
8. Use of techniques which encourage student participation in natural
environments: group- and pair-work, simulations, information-gap exercises;
9. Presentation of language items in contexts of typical use rather than in
isolation;
10. Materials which are either "authentic" (i.e. not originally intended for
language teaching at all), or which simulate authenticity;
11. For much if not all of the time, a lack of prediction by the teacher of exactly
what language is to be used by learners, because they will be engaged in
simulated "natural" language activity, whether reading, listening, conversing, or
writing.
(Introduction to ARAL 1987.)
These characteristics reflect a range of improvements felt by teachers to be
intuitively necessary, arising out of a combination of general discussion
(including linguistic and applied linguistic discussion) and dissatisfaction with the
details of their own current practice. However, few if any of these elements are
new; it is the combination that is new, together with the justifications adduced
for the reuse in new combinations of traditional practices. And no one teacher is
likely to reflect the whole of this package. Furthermore, teachers may well
orientate themselves towards such justifications (at the level of Approach), and
change current practices very little, or be prepared to combine drilling (say)
from another "method" with the "natural" techniques referred to in item 8 above.
4 The need for more general categories
It is at least arguable that the use of the term "method" obscures as much as
it reveals. It is difficult to see that the requirements of Anthony's sketch, or of
Richards and Rodgers' more developed outline, are actually addressing
questions significantly different from those with which a commentator like Clark is
concerned, although he claims only to deal with educational value-systems in
curriculum renewal (Clark 1987: 3). Citing Skilbeck (1982), Clark identifies
three value systems (classical humanism, reconstructionism, and
progressivism) and relates them to foreign language teaching. The first is realised through
Grammar-Translation, the second through a variety of procedures including
audiolingualism, functional-notional syllabuses and graded objectives, and the
third through a number of process-oriented approaches.
This discussion does not use the term "method", but may be seen as dealing
with very similar issues to those addressed by Richards and Rodgers, without
being committed to the notion that methods come in discrete packages that are
readily identifiable, and that can be chosen from a set of conveniently available
options.
At a less abstract level, I have attempted to define the major features of
classroom planning and organisation (summarised in Brumfit 1984: 95-96). This
concentrates on three types of analysis of product (linguistic, interactional and
content or topic analyses). Any piece of genuine linguistic data will be capable of
being analysed in terms of all three, of course, but particular teaching
programmes will tend to concentrate on some dimensions rather than others. That
is to say that teachers will see some of these features as crucial for learning, even
though the goal of learning will inevitably include all of them. However, the
major criterial elements in the classroom process are more important. These are
identified in terms of (i) Communicative Abilities ("conversation/discussion,
comprehension, extended writing, and (possibly) extended speaking" being
preferred as categories to the traditional "four skills" model), (ii) orientation
towards Accuracy or Fluency, and (iii) pedagogical mode ("Individual, Private
Interactional (i.e. pairs or small groups) and Public Interactional (i.e. whole
class or large groups)").
Similarly, the kinds of category systems devised for specific research
purposes, such as that produced at the University of Stirling to characterise modern
language teaching in Scottish schools (Mitchell, Parkinson and Johnstone 1981),
or those devised at the Modern Language Centre at OISE (COLT and TALOS;
Ullman and Geva 1984), will be equally appropriate starting points for research
discussion, for their categories, too, derive from a view of the key components of
the language classroom.
But all the attempts that have been listed have lacked substantive discussion
of what is fundamental to language teaching, and what is peripheral. They have
taken classroom activity as a phenomenon, and tried to characterise its features,
but have concentrated on the concrete and specific. The risk in doing this is to
refuse to address the more abstract, and often more obvious, criteria for
establishing the essence of language teaching. This is an important issue, for what is
logically and unavoidably necessary for successful language teaching needs to be
distinguished from what is merely contingent, and subject to fashion. The latter
is conventional, and may be negotiated, while the former will consist of those
few elements that are defining of the language teaching/learning process.
I have suggested on various occasions (e.g. Brumfit 1985: 38) that there are
in fact only three fundamental requirements for successful language learning.
These are:
(i) exposure (possibly systematic) to the target language;
(ii) opportunities to use the language (either actively or passively);
(iii) motivation to respond to the two previous requirements.
Without these three, language learning cannot take place, so teachers
unavoidably have to take a position on all three. But everything else is a matter of
convention, and conventions are negotiated by all those with interests in the
institutions of education: teachers, learners, parents, government, administrators
and others. Further, it is at least arguable that none of these conventions can be
seen to be static, not only because needs of learners vary and the views we have
of language and the world vary as our knowledge improves, but even more
because the institution of schooling, and the history of language teaching, have
their own dynamic. What is motivating now may not be motivating next year,
simply because next year it is a year older. Language teaching is part of a much
larger system, and the characteristic of a system is that if one element changes,
all the others subtly adjust to accommodate the change. Insofar as language
teaching is part of education, its elements will be subject to change caused by
factors that are totally outside the control of language and language-learning
theory.
5 Language teaching in a social context
If we follow this line of reasoning, we arrive at a view of education which
goes like this:
Schooling, which includes teaching, is a co-operative activity performed by
human beings. Participants in this enterprise are constrained by the micro-social
context within which they operate, so any teaching will have limitations on
available options imposed by the nature of the classroom. Classrooms, for example, can
mimic reality outside, but they cannot avoid being preparatory to reality: they
can never be ends in themselves. They are also responsive to social networking
that is based on unequal power relations: a classroom stops being a classroom if
it contains more than a small number of people (usually only one) with authority
over a much larger number. These constraints favour particular kinds of social
interactions, but the negotiation of appropriate interaction that goes on within
them is also subject to the macro-sociological context which will reflect larger
ideologies of the time. What does or does not "work" will depend on how much
freedom rather than control is encouraged, how hospitable to diversity the
current atmosphere is, and so on. Factors of these kinds, mediated through the
views of students, their parents, administrators, politicians and others, will
necessarily constrain any teaching. It is only within this context that matters of the
nature of language and the nature of language learning become important. These
kinds of constraint, as van Lier notes (1988: 82), have been addressed to some
extent in relation to bilingual education, but have been scarcely touched by
researchers into FLT.
Furthermore, since the social network provides its own dynamic, sensitive
teachers may well, in the course of their many centuries' experience of teaching,
have explored all possible permutations of language learning behaviour. What
changes, as research into classroom behaviour continues, is not necessarily the
essential structure of teaching method, but our ability to describe and explain
that structure more sensitively. There is no logical or necessary relationship
between the findings of research and the behaviour of learners or teachers, any
more than our ability to explain evolution more successfully entails changes in
the behaviour of the animals that are evolving.
Now of course to say that no logical relationship is entailed does not mean
that no relationship is possible. It is not my contention that teachers should take
no notice of research, nor that teaching cannot improve as a result of research.
But the argument does suggest that the concerns of researchers must be to try to
understand something that is a given, not to get mixed up with the claims of
those who wish to make money or name or who simply want to improve the
existing system. The dissatisfactions with present practice of present teachers are
data for the researcher, but enormous care must be taken to avoid seeing changes
of convention as somehow to be interpreted as changes of principle.
6 Conclusion
So where do we get in defining the characteristics of particular methods?
Insofar as we are concerned with language teaching in education, we must see
our interest as potentially with the whole of language. While individuals may in
some circumstances need to be able to read only, or need limited conversation,
these are not the core model for language learning. In practice, the aim of
language teaching is to enable learners to be able to choose what uses they make of
the target language, in the same way as they choose what uses they make of a
first language. That aim may not be realisable, and realistic goals may only be
able to go part of the way towards that ideal: but realism is an admission of
unavoidable constraints preventing what is intended, not a statement of intention.
In this context, there cannot be major differences of intention between
different educational systems or methods of teaching about what they are trying to
do, only about what is effectively realisable with the resources and time
available. Consequently, there is sense in trying to devise categories that relate to all
conceivable language learning situations, as a means of characterising particular
practices at particular times. What is more difficult is assigning particular
methods to particular practices without using a circular argument: you measure only
the characteristics that enabled you to identify the method in the first place.
Other problems with the definition of "method" abound.
"Grammar-Translation" refers primarily to two specific techniques used in class; "Audiolingualism"
relates to a theory combining a view of the nature of language structure with a
view of the nature of language learning; "Direct Method" refers to a very
general learning theory and a very general technique. The Communicative
Approach accepts much of the Direct Method learning theory, but is far less rigid
than either Direct Method or Audiolingualism in accepted techniques.
Suggestopedia and The Silent Way possess unusual technical features (use of baroque
music, or of Cuisenaire rods, for example) that have been taken over by outsiders
without accepting the whole philosophical package that underpins the
"method". The so-called "humanistic" methods include, in The Silent Way, a
highly cognitive and in many ways traditional structural syllabus, and in
Community Language Learning a method which expects the syllabus to emerge out
of the topics decided on by the learners. The term "method" as currently used
incorporates a large number of conflicting and ill-defined features.
But the elements within classrooms can clearly be discussed in terms of a
number of key features. It would be perfectly possible to specify the
characteristics of classroom behaviours in terms of the structure of language presented
to a class, the structure of practice opportunities, and the devices for motivation
of students, for example, to use the three key criteria referred to above.
Similarly, the features of any of the categories isolated for mention by others could
likewise be listed and quantified, and the Mitchell, Parkinson and Johnstone (1981)
list does this for some major features of language classrooms.
What seems much harder to sort out is whether there would be value in
demanding an advance specification of "method" as such. Probably "method" is
better seen as a retrospectively-perceived constellation of common features
rather than as something that can be identified and predicted in advance. To
predict it in advance would be to reduce the teacher's and pupils' roles as
determinants of classroom procedures to such an extent that crucial elements of
teaching and learning would almost certainly escape observation.
References
Allen, Patrick and Merrill Swain, eds. 1984. Language Issues and Educational Policies. Exploring
Canada's Multilingual Resources ( = ELT Documents, 119). Oxford: Pergamon.
Annual Review of Applied Linguistics 1987. "Communicative Language Teaching." New York:
Anthony, Edward M. 1963. "Approach, Method, and Technique." ELT 17.63-67.
Berlitz, M.D. 1907. Berlitz Method for Teaching Modern Languages. New York: M.D. Berlitz.
Brumfit, Christopher. 1984. Communicative Methodology in Language Teaching. Cambridge:
Brumfit, Christopher. 1985. Language and Literature Teaching From Practice to Principle. Oxford:
Pergamon.
Clark, John L. 1987. Curriculum Renewal in School Foreign Language Learning. Oxford: Oxford
University Press.
Connell, W.F. 1987. "History of Teaching Methods." Dunkin 1987.201-214.
Dunkin, Michael J., ed. 1987. International Encyclopedia of Teaching and Teacher Education.
Oxford: Pergamon.
Dunkin, Michael J. and Bruce J. Biddle. 1974. The Study of Teaching. New York: Holt, Rinehart
and Winston.
Feiman-Nemser, Sharon and Robert E. Floden. 1986. "The Cultures of Teaching." Wittrock
1986.505-526.
Gordon, Peter and Denis Lawton. 1984. A Guide to English Educational Terms. London:
Batsford.
Hesse, M.G., ed. 1975. Approaches to Teaching Foreign Languages. Amsterdam: North-Holland.
Horton, T. and P. Raggatt, eds. 1982. Challenge and Change in the Curriculum. London: Hodder
and Stoughton and Open University.
Howatt, A.P.R. 1984. A History of English Language Teaching. Oxford: Oxford University Press.
Krashen, S. and T. Terrell. 1983. The Natural Approach. Oxford: Pergamon.
Larsen-Freeman, Diane. 1986. Techniques and Principles in Language Teaching. Oxford: Oxford
University Press.
Mitchell, Rosamond, Brian Parkinson and Richard Johnstone. 1981. The Foreign Language
Classroom: an Observational Study. ( = Stirling Educational Monographs, 9). University of Stirling.
Rambert, Marie. 1972. Quicksilver. Basingstoke: Macmillan.
Richards, Jack C. 1984. "The Secret Life of Methods." TESOL Quarterly 18/1.7-23.
Richards, Jack C., John Platt and Heidi Weber. 1985. Longman Dictionary of Applied Linguistics.
Harlow: Longman.
Richards, Jack C. and Theodore S. Rodgers. 1986. Approaches and Methods in Language Teaching.
Cambridge: Cambridge University Press.
Rivers, Wilga M. 1968. Teaching Foreign-Language Skills. Chicago: University of Chicago Press.
Skilbeck, M. 1982. "Three Educational Ideologies." Horton and Raggatt 1982.
Suppes, Patrick, ed. 1978. Impact of Research on Education. Washington, DC: National Academy
of Education.
Van Lier, Leo. 1988. The Classroom and the Language Learner. Harlow: Longman.
Ullman, Rebecca and Esther Geva. 1984. "Approaches to Observation in Second Language
Classes." Allen and Swain 1984.113-128.
Wittrock, Merlin C., ed. 1986. Handbook of Research on Teaching. Third Edition. New York:
Macmillan.
Evaluation of Foreign Language Teaching Projects
and Programmes
Rosamond Mitchell
1 Evaluation and Educational Research
The story of programme evaluation in mainstream educational research
over the last 20 years or so has been one of massive expansion, but also of
profound reexamination of goals and methods. Early requirements for programme
evaluation, stimulated in the US by massive federal interventionism, were
largely met by psychologists trained in an experimental, Fisherian research tradition
(see Atkin and House 1981). The first notable public challenge to this model in
its evaluation applications came from Stake, arguing for a broad descriptive
approach producing accounts of programmes in context:
"The purpose of educational evaluation is expository: to acquaint the
audience with the workings of certain educators and their learners. It differs from
educational research in its orientation to specific program rather than to
variables common to many programs. A full evaluation results in a story,
supported perhaps by statistics and profiles. It tells what happened. It reveals
perceptions and judgements that different groups and individuals hold
(obtained, I hope, by objective means). It tells of merit and shortcomings. As a
bonus, it may offer generalisations ('The moral of the story is...') for the
guidance of subsequent educational programs" (Stake 1967a: 5).
Subsequently, the case for broadening the scope of evaluation inquiry be
yond the experimental paradigm has been argued by numerous theorists in the
Anglo-American research community, among whom the most notable are per-
146 ROSAMOND MITCHELL
haps House and Cronbach (US) and MacDonald (UK): see for example, House
(1980), Cronbach et al. (1980), and MacDonald and Walker (1974). (Cronbach
(1982: 324) in particular presents an extended critique of what he calls the "out
moded recommendation that the program evaluator prefer true experiments".)
The grounds for this shift lie essentially in the recognition that evaluation is an
applied, policy-related activity, with a short-term, "improvement" orientation
rather than fundamental research; as Cronbach (1982: 2) remarks:
"Many kinds of inquiry and pseudoinquiry are called evaluations. I restrict
attention to inquiries that represent serious attempts to improve a program or a
kind of service by developing a clear picture of its operations and the fate of
its clients".
It would be wrong to suggest that a consensus exists in the mainstream of
educational research regarding the appropriate scope and procedures for
programme evaluation; however, it is now possible for a leading British practitioner
to argue that it must be seen as "a practical, particularistic, political, persuasive,
educative service" (Simons 1987: 8). Notable continuing (and interrelated)
concerns within the evaluation community have to do with:
a) The accountability of evaluators. Thus Stake and others argue for
"stakeholder" models of evaluation, in which all significant interest groups likely to be
affected by an educational evaluation, professional and non-professional, are
given a role in determining the scope and concerns of the enquiry (see e.g. Stake
1967b).
b) The methodology of evaluation. As in educational research generally,
tensions persist between those maintaining a defence of quantitative, experimental
and quasi-experimental designs (e.g. Boruch and Cordray 1980, quoted in
Cronbach 1982: 24), those rejecting them in favour of so-called "naturalistic enquiry"
drawing on the research methods of ethnography (e.g. Guba and Lincoln 1985),
and those arguing for an interactionist use of methods from both traditions (e.g.
Cronbach 1982: passim). Closely related to this issue are debates about the
degree of confidence with which programme outcomes can be considered to be
programme effects, where methods other than true experimental designs are
employed, and the extent to which "reasonable inference" is acceptable.
c) The dissemination of evaluation findings. As Cronbach (1982: 8) remarks:
"Social institutions learn from experience; so do program clients and political
constituencies. The proper function of evaluation is to speed up the learning
process by communicating what might otherwise be overlooked or wrongly
perceived. The evaluator, then, is an educator. His success is to be judged by
his success in communication; that is, by what he leads others to understand
and believe. Payoff comes from the insight that the evaluator's work generates
in others.
A study that is technically admirable falls short if what the evaluator learns
does not enter the thinking of the relevant political community".
Whether or not they share Cronbach's view of the "evaluator as educator",
contemporary writers on evaluation concur that active dissemination is criterial
for an evaluation to be judged useful/successful.
d) The relationship of programme evaluation to decision-making and policy
formation. While rational models of programme evaluation have suggested that
"evaluation is the process of delineating, obtaining and providing useful
information for judging decision alternatives" (Stufflebeam et al. 1971: xxv), the
history of evaluation experience makes it clear that evaluations may play only a
minor role in direct decision-making regarding the programmes they have
studied, but make a significant longer-term contribution to future policy and
programme development (Simons 1987: 18-20). Thus, the prime "consumers" of
evaluation reports may be groups in somewhat different specific contexts than
that studied, considering the future development of similar rather than identical
programmes; hence the reconceptualisation of the notion of "generalisation" as
"extrapolation", discussed at length by Cronbach (1982) among others.
The main body of this paper will make continuing reference to these
themes, in discussing more specifically the evaluation of second/foreign
language projects and programmes.
2 Experiment and quasi-experiment in FL/L2 programme evaluation
The Colorado Project (Scherer and Wertheimer 1964) and the Pennsylvania
Project (Smith 1970) are well-known starting points for discussions of FL
programme evaluation (Long 1980, 1984; Beretta 1986a). These studies each
attempted to compare two FL teaching "methods" (audiolingualism plus a more
"traditional" mode of instruction) using large-scale, field experimental designs
in which classes and their teachers were randomly assigned to either
instructional method, with learners' FL achievement as the dependent variable. Their
findings were inconclusive; this politically inconvenient outcome immediately
provoked extensive critiques of the methodology employed by other members of
the FLT research community (see e.g. the October 1969 issue of the Modern
Language Journal, entirely devoted to a series of critiques of the Pennsylvania
Project). However, as Beretta points out, the thrust of these criticisms was "not
for failing to produce an evaluation that was capable of influencing policy, but
for failing to arrange for the tight controls that would have promoted internal
validity and contributed to a theory of language learning" (1988: 4). Thus for
example the Pennsylvania Project was criticised for failing to ensure the two
instructional "treatments" remained distinct (Otto 1969), and for bias in the tests
used (Valette 1969). Some researchers involved in these critiques went on to
develop models for experimental "comparative methods" research with stronger
internal validity (e.g. Freedman 1976, who substituted pre-recorded
instructional sequences for the undependable live teacher variable). Whatever the
merits of such designs for fundamental research, as Beretta remarks, they "can have
only extremely remote implications for practice" (1986b: 146), and consequently
can have little role to play in user-oriented programme evaluation.
The literature of the late 1970s and 1980s on the evaluation of second and
foreign language programmes shows strikingly uneven levels of awareness of the
debates within mainstream educational research sketched in the introductory
section of this paper. Reports of substantive evaluations of FL/L2 programmes
are more commonly found in the literature than are discussions of evaluation
methodology. The former may include some explicit rationale for the choice of
evaluation procedures, but for most the rationale remains largely implicit and
must be deduced from the account provided of the evaluator's practice.
Among those who do contribute to the substantive discussion of evaluation
methodology, Richards maintains a strikingly strong commitment to "true
experimental design" as the only worthwhile form of programme evaluation
(1984). He commends the small, well-controlled experimental study of Wagner
and Tilney (1983) as an "excellent example" of method evaluation (Richards op.
cit.: 18), and argues that similar principles should be followed in the evaluation
of large scale, long term projects such as the Bangalore "procedural syllabus"
project (Prabhu 1987), which is singled out for special criticism.
Others may feel, however, that the Wagner and Tilney study illustrates well
the problems associated with experimental models in evaluation contexts. The
study concentrated on decontextualised vocabulary acquisition — i.e. one aspect
only of the methodological "package" under investigation. The experimental
"treatment" (vocabulary recited to the sounds of Baroque music) was
delivered via an audiotape, while the control equivalent was delivered by a real
live, "traditional", teacher. (While the behaviour of the latter was well
controlled, the subjective attitudes of the teacher towards the experiment, and of
his/her students towards him/her as a person, could not be controlled away.
Neither of course could the individual learning strategies of the students, in and
out of class, randomly assigned though they were.) The number of subjects
within each condition was small, and the population from which they came an
unusual one (why music students?). As it happened, the experiment produced no
significant differences in vocabulary acquisition between the control and
experimental groups. From this, the conclusion is drawn that "it remains to be
shown that 'Superlearning' [a version of 'Suggestopedia'] really is better than an
experienced, successful 'traditional' teacher in a 'traditional' classroom setting"
(Wagner and Tilney 1983: 16). This conclusion appears acceptable; but if a
significant difference in vocabulary acquisition had emerged, would onlookers have
been happy to accept the opposite, and move towards the implementation of
"Superlearning" in their classrooms, on the basis of so limited a study? This
seems highly unlikely, for reasons primarily to do with the weak external validity
of the study.
Long (1984) shares Richards' commitment to experimentation as the
strongest research design available for FL/L2 programme evaluation (and like
Richards, does not address the objections to this commitment expressed widely in
the general educational literature). The thrust of his argument concerns the
need to strengthen the internal validity of classroom experimentation by
consistent monitoring of classroom processes within experimental and control
conditions. That is, he addresses one of the main design weaknesses identified in
studies such as the Pennsylvania Project (where it is now thought likely the
experimental and control methods shaded into each other, but relevant process
data were not collected).
Monitoring the nature and quality of classroom processes is a vital part of
many evaluations of educational programmes (see Parkinson et al. 1982,
Ullmann et al. 1983, and Mitchell et al. 1987, for examples of L2/FL-related studies
which make extensive use of systematic observation techniques for this
purpose). Such monitoring may have a range of different purposes. It may be
undertaken to check the feasibility and degree of implementation of classroom
procedures promoted by the programme; to monitor interactions between new
procedures and old, and detect unintended side-effects; or to analyse resulting
teaching/learning experiences with reference to criteria derived independently
of the project (say, from a learning theory other than that on which the project
may be based). Long, however, reserves the term "Process evaluation" for
classroom observation with a single, normative focus: i.e. monitoring for the
maintenance of key planned differences between "treatments". Only this type of
observation, he claims, can "provide explanations for the findings of product
evaluations" (i.e. for those documenting patterns of learner outcomes) (op. cit.:
419).
One serious practical problem with a commitment to true experimental
designs for the purpose of programme evaluation is of course that in the real world
of education, such designs are frequently unacceptable on political-social
grounds. Thus for example, the French immersion programmes popular in the
schools of English-speaking Canada serve volunteer populations, and any
proposal for the random assignment of children to such programmes would be
strongly resisted by parents and community. These politically visible programmes have
in fact attracted one of the most sustained and best-resourced evaluation
programmes to be found in connection with L2 learning. It is striking, however, that
these evaluation studies have generally followed in the Fisherian experimental
tradition, using the "next best", quasi-experimental designs, and concentrating
their efforts on monitoring linguistic, academic and attitudinal outcomes. (In
these quasi-experiments, intact volunteer groups receiving the "experimental"
treatment, i.e. immersion, are typically matched with control groups comparable
in age, IQ, socioeconomic status etc., following a range of other programmes
with different mixes of French and English instruction. Typical outcome
measures relate to L1, L2 and other academic achievement, as well as to
attitudes. See for example, reviews by Swain and Lapkin 1982, and Swain 1984.)
These evaluation studies, taken in their own terms, are clearly vulnerable to
Long's criticisms of the weak internal validity of "product evaluation", given the
general absence of systematic monitoring of implementation. (It is assumed, for
example, that immersion teachers speak nothing but French, delay the
introduction of English literacy skills etc.; but generally speaking, no systematic
monitoring is reported which will reassure the skeptic that this is indeed the case.)
Viewed from a wider evaluation perspective, the absence of more broadly
formulated process questions from the Canadian evaluation agenda is also a
matter of concern. The immersion process claims are strong: "that the same
academic content will be covered as in the regular English programme, the only
difference between the two programmes being the language of instruction" (Swain
1984: 35). Any consumer of immersion evaluations, perhaps considering the
applicability of similar principles in his/her own particular context, will want to
know a great deal about the workings of this "principle" in practice. Is the
mother tongue really excluded from instruction? If not (and we are told in
general terms that students at least may use it), then what particular functions
does it perform? What strategies have the French-using teachers devised, to
present "the same" academic content to non-French-speaking students? Are
there implications, for example, for the pacing of content coverage? For the
quality and nature of class discussion? For the ways pupils can best be grouped?
A host of similar questions arise, in the mind of the interested consumer; yet the
immersion evaluations provide only fragmentary and tantalising glimpses of how
it all works out in practice (see e.g. Lapkin et al. 1983 and Morrison et al. 1984,
for indirect evidence on the use of English in immersion classes).
3 Alternatives to experimental models
Throughout, a few voices have been raised with reference to the Canadian
studies, to argue for the broadening of the quasi-experimental, "product"
evaluation model to encompass process questions of the kind outlined above. Thus,
Hornby (1980) makes similar suggestions in the context of immersion
programmes in the United States. Ullmann and Geva (1985) argue the case in
Canada, drawing on their own use of systematic classroom observation in the
evaluation of a "core French" programme to exemplify the approach (Ullmann
et al. 1983). However, it would appear that to date, the impact of these
arguments has been slight.
In discussing this "gap" in the Canadian immersion evaluation procedures,
Beretta suggests that the researchers have themselves been aware of the "value
of documenting implementation" (1988: 9), but have failed to argue the case
with sponsoring bodies primarily concerned with public reassurance. This is
perhaps to underestimate the difficulties of collecting process data, in politically
sensitive contexts (some of which are narrated in Mitchell forthcoming); it may
be that immersion programme developers and/or teachers resisted judgmental
scrutiny of the classroom "black box". This interpretation may be lent credence
by the existence of studies such as that of Canale et al. (1987), who are able to
report sensitive classroom case study material (including, for example, an
account of children being held up to public ridicule for mother tongue use) in the
context of an advisory rather than an evaluative document. But the international
evaluation community will benefit if the Canadian researchers themselves can
ultimately produce a full account of the rationales and constraints, academic and
political, which have formed their evaluation agenda.
Because of its scope and international influence, the Canadian
French-related evaluation experience has been discussed at some length. The other major
North American L2 evaluation tradition, which has received rather less
international attention, is that associated with federally-funded bilingual education
programmes in the United States. Here too the general emphasis has been on
product evaluation inspired by (though not necessarily rigorously implementing)
experimental and quasi-experimental designs (Baker 1981). There has, however,
been somewhat greater variety in evaluation strategies adopted, with some
critical commentaries on the "product" model and attempts to explore life
inside bilingual programmes using ethnographically-inspired observational
strategies (see for example the introductions to, and empirical studies reported
in Cohen et al. 1979).
Apart from these two substantial, government-funded empirical evaluation
traditions, the universe of FL/L2 programme evaluation is relatively fragmented.
The recent methodological contributions of Beretta (1986a, 1986b and 1988) are
exceptional in their depth of acquaintance with the general evaluation literature;
a few other EFL specialists deal more summarily with the area (e.g. White
1988). The likelihood that language educators will interest themselves in
evaluation issues has however some relationship with the degree of accountability
expected in particular professional contexts. Thus for example, teachers of English
for Special Purposes have a clear concern with evaluation, including an
awareness of the general evaluation literature unusual among L2/FL specialists, which
they themselves attribute to a strong sense of accountability to their sponsors,
whether governments or commercial organisations (Mackay 1981; McGinley
1986), and/or to their students (Waters 1987). So, Mackay argues for a model of
evaluation which "has as its purpose the provision of those in authority with
information which can be used in making decisions about improving or modifying
the program" (1981: 107). For this purpose he stresses the need to complement
student achievement data not only with a range of process information, but also
with a theoretical critique of the programme rationale, and provides a case study
of an actual evaluation which exemplifies these principles:
"The explicit purpose of the appraisal was not so much to assert merely that
the project was satisfactory or unsatisfactory, but to provide those responsible
for its future as detailed an account as possible of every factor which might
contribute to the project's success or lack of success. They would then be in a
position to make decisions, which might affect any aspect of the program, on
the basis of comprehensive and objectively gathered information" (op. cit.:
114).
Where conditions of strong line management and accountability similar to
those encountered in ESP obtain, evaluators show similar concerns and
practices even in respect of elementary L2 programmes (see for example an
evaluation study of "basic English" programmes for US army recruits: Holland et al.
1986). The current "proficiency" movement in FL education in the US also
seems to have stimulated accountability-driven interest in evaluation procedures
among FL professionals (Hagel Jacobson 1982; Lee 1982), and to have
generated institutionalised evaluation procedures in many states (e.g. Indiana State
Dept of Public Instruction 1981; Ohio State Dept of Education 1981; Oklahoma
State Dept of Education 1981; and California State Dept of Education 1985).
However, published accounts of actual evaluation studies so far seem few and of
variable quality (see e.g. Barrow 1986; Freed 1987).
The authors of the evaluation proposals and studies surveyed in the last few
paragraphs, while concerned to distinguish evaluation studies from basic
research, and moving away from the experimental and quasi-experimental models
discussed earlier, retain a commitment to quantitative procedures and assume
the prime audience for evaluation studies is one of decision-makers in authority.
Much rarer in the L2/FL literature are studies in the "naturalistic enquiry"
tradition (Guba and Lincoln 1985), which eschew formal "product" measures and
aim to be responsive to the concerns of the wider client community, and not only
to official authority. Such instances as can be found typically have to do with L2
programmes serving minority/disadvantaged communities, such as the Canadian
Indian language teaching programmes investigated by Hebert et al. (1984), or
the ESL programmes for students of Asian origin in Western Canada evaluated
by Barrington (1982 and 1986). Barrington's work in particular strikingly
illustrates the strong identification with the client group likely to ensue from such
approaches to evaluation, and also represents an attempt to present evaluation
findings in a format comprehensible beyond professional circles.
I have no knowledge of professional/authority reactions to these Canadian
studies, but British examples in this tradition have encountered substantial
criticism on a range of grounds, even from a sympathetic audience. Thus the quality
of the study produced by MacDonald et al. (1982) of a bilingual school in Boston
has been criticised by other UK classroom ethnographers (Atkinson and
Delamont 1985). Simons' 1978-9 study of an EEC-sponsored heritage language
(Italian/Hindi) teaching project in Bedfordshire, England was, she claims,
suppressed by the responsible local education authority, with the acquiescence
of her sponsoring academic institution, due to her attempt to encompass the
actions of high LEA officials within the evaluation study (Simons 1987: 141-169).
While such actions must be deplored, it is nonetheless the case that the range
and quality of empirical data presented in these politically-aware British
"naturalistic" studies is generally disappointing.
In this paper I have developed a view similar to Beretta's, that experimental
and quasi-experimental research designs are inadequate as models for FL/L2
programme evaluation, even if endowed, as Long suggests, with the added
internal validity deriving from the systematic gathering of classroom process
information relating to L2 acquisition theory. If evaluations are to feed into and service
programme decision-making and development, they must inevitably address a
much wider range of questions than can possibly be accommodated within an
experimental, hypothesis-testing framework. Evaluation is about inferring the
most likely relationships among a whole network of complex events, from a wide
range of quantitative and qualitative evidence, not about determining strong
causal relationships between small numbers of events. It is about the monitoring
of intended events and their effects, but also about the identification of the
unexpected, and the proposal of untried solutions. One of Cronbach's final maxims
is that "evaluation is an art" (1982: 321).
4 The art of answering evaluation questions: a case study
The concluding section of this paper will illustrate some of these points,
with reference to an evaluation study in which the author was recently involved
(Mitchell et al. 1987). The programme to be evaluated was a bilingual
(Gaelic-English) primary school programme in the Western Isles of Scotland. The
programme had already been running for eight years at the time the evaluation was
commissioned. The commission arose out of a conflict between the Western
Isles local education authority and the central Scottish Education Department
regarding the worth of the programme. (The Western Isles had requested
funding from the SED for an extension of the bilingual programme into secondary
education, on the basis of the claimed success of the primary programme; the
SED said it was not satisfied as to the latter, and offered to fund the evaluation
study instead. After considerable internal conflict, leading to the resignation of
key programme developers, the local authority accepted the offer and
researchers from Stirling University, including the author, were invited by the
SED to undertake the study.)
By the time the evaluation study was planned, the bilingual programme had
been extended to all primary schools in the Western Isles. A formal control
group design was thus out of the question, as no "uncontaminated" control
schools were available. However, on the basis of preliminary interviews with
head teachers, it was possible to make a preliminary categorisation of the schools
as having different levels of expressed commitment to the programme. The
evaluation study concentrated mainly on a sample of "high uptake" schools, but
a small number of "low uptake" schools were also included for purposes of less
formal comparison on process and product measures.
The original bilingual education project (BEP) concentrated its efforts
largely on altering the pattern of classroom experience of bilingual children,
favouring use of both languages as media of instruction, the integration of
language arts work with other curriculum areas (notably Environmental Studies),
and the adoption of child-centred methods. While the project aimed to develop
children's bilingual competence to a high level, precise language objectives had
not been formulated.
Following the commitments of the original project, the evaluation study
committed the main part of its resources to a study of classroom processes,
employing two different systematic observation instruments, unstructured
observation, and teacher interviews for the purpose. A second major element in the
evaluation was the assessment of children's writing and speaking skills in both
languages, for selected year groups (Primary 4 and Primary 7). A third, minor
element involved parent interviews, to discover their attitudes towards the
project and degree of involvement with it (community involvement having been a
key theme in the early stages of the BEP). An investigation of children's
language attitudes had been proposed by the evaluators, but was dropped from the
evaluation plan after opposition from the BEP development group.
The questions of concern to those who sought the evaluation, insofar as they
were articulated, were the following:
- To what extent was the curriculum envisaged by BEP being implemented in
bilingual classrooms?
- To the extent that current practice was in line with BEP objectives, how
confidently could this be attributed to the BEP initiative itself, or were
other factors responsible (e.g. a national trend favouring child-centring and
curriculum integration)?
- Had the BEP brought about measurable gains in bilingual children's
language skills?
- Had the education of monoglot (English-speaking) children in schools
involved in the BEP suffered in any way?
An evaluation study of the kind outlined above could provide its clearest
and most confident answers to the first of these questions. Primarily through the
different structured observational procedures, it proved possible to produce a
rich picture of current classroom activity, indicating a considerable degree of
curriculum integration, and use of both languages as media of instruction across
the curriculum and for a full range of instructional purposes. (Observed
differences between "high" and "low uptake" schools consisted mainly in the much
greater willingness of teachers in the former category to codeswitch, within
instructional episodes. The main overall constraint identified on the use of Gaelic
as a medium of instruction was the teachers' perceptions of differential Gaelic
fluency levels among their pupils. Where individuals were perceived as
non-fluent, teachers addressed them very infrequently in Gaelic; where such
individuals formed a significant proportion of a class, English predominated in
whole-class instruction also, thus also affecting the overall language experience
of even fully-fluent Gaelic speakers in such classes. As indicated later, these
findings strongly influenced the policy recommendations of the evaluators.)
Child-centring and experiential learning, however, were being implemented to a
much more limited degree, comparable with that found in other studies of
contemporary British primary schooling (an explicit point of comparison was Galton
et al. 1980).
The second question was much less capable of being answered satisfactorily,
given the constraints under which the evaluation study operated (most notably,
the absence of data regarding the state of affairs existing in the schools when
BEP was established, and on relations between the BEP development team and
the schools, during the years prior to the evaluation). However, the question
could be tackled indirectly, partly through indirect teacher reports in interview,
and partly through examination of the classroom process data. The BEP had
concentrated its attention on two particular areas of the curriculum: Gaelic
Language Arts, and Environmental Studies. Comparison of Gaelic Language Arts
work with English Language Arts, never a particular focus of BEP attention,
showed striking differences in teaching methodology. The distinctive aspects of
GLA work as compared with ELA were consistently in line with BEP
recommendations; thus for example, oral work and discussion, much favoured by BEP,
were common for GLA but very unusual for ELA. Such evidence suggested that
BEP had indeed been the decisive "delivery mechanism" for a range of
"progressive" methodological ideas, even though the latter may have been
promoted nationally through other mechanisms also.
The third question was the hardest to respond to in any meaningful way,
given the non-experimental design of the evaluation, and our conclusions were
perforce very tentative. The language assessment procedures adopted involved
eliciting extended samples of speech and writing of different types, in both
languages, from all P4 and P7 pupils in the 10-school sample. Thus the evaluation
contributed to the domain of public discussion a substantial description of
children's bilingual proficiency of a kind not previously available, which allowed for
direct comparisons between levels of achievement in English and Gaelic, at two
different age levels, and also allowed for at least informal comparisons with
national levels of achievement in English (as some assessment tasks were derived
directly from those used in a large scale English study: Gorman et al. 1982).
This aspect of the study provided reassurance that levels of achievement in
English were generally satisfactory, and that a majority of children were also
able to communicate effectively in Gaelic, though few pupils were attaining to
equal levels in both languages. But the thorny question remained, regarding the
extent to which the language skills documented could be attributed to the BEP
itself.
Even partially to answer this question, it was judged necessary to treat oral
and literacy skills separately. Children's performance on the Gaelic oral
assessment tasks correlated very strongly with teachers' independent global ratings of
their Gaelic fluency. All children performed effectively in spoken English,
including those from classrooms where Gaelic was the dominant language of
instruction; however, the older children's English performance surpassed that of
the younger on measures to do with control of longer discourse and sensitivity to
the listener. The older children also outperformed the younger much more
substantially in Gaelic, on all measures. While (being older) they had benefited
from longer experience of bilingual schooling, and while school influences were
detectable in some respects (e.g. growth in technical vocabulary), the evaluators
were not convinced that this was the prime reason for their greater oral Gaelic
ability, given the teachers' consistent reports of community language shift and of
decline in Gaelic proficiency among children entering school. It was concluded
that Gaelic use in school was compensating in part only for community decline
in use of the spoken language.
For literacy skills, however, correlations between Gaelic test performance
and teacher fluency ratings were much weaker. That is, many children judged by
their teachers to be non- or partially-fluent were able to produce extended if
inaccurate Gaelic writing; this was true at both age levels, though again the older
pupils generally outperformed the younger. Here, the evaluators could attribute
Gaelic achievement to school experience with much greater confidence.
The absence of attitude measures somewhat curtailed the answers which could be
provided to the fourth question, but it was nonetheless possible to draw
conclusions from the classroom process data of significance not only for the
monoglot English-speaking children but for the future of the bilingual
programme as a whole. Far from being neglected or alienated from classroom
life, these pupils seemed to be extremely influential in terms of the language
experience available for all pupils. Teachers consistently addressed them
individually in English (thus in effect confirming rather than destabilising their
"monoglot" status), and where they were at all numerous, the use of Gaelic in
whole class instruction was significantly restricted. The evaluators viewed this
pattern as ultimately threatening to the viability of the overall programme, and
recommended that a clear policy decision be taken regarding Gaelic L2
instruction for this group.
The above discussion illustrates the kinds of answers, more or less
definitive, which can be provided for key questions motivating evaluation studies
through a non-experimental but many-faceted design. Lest evaluators develop
too inflated a view of their likely influence, however, this evaluation study also
illustrates the problematic relationship of research-based evaluation to short-term
decision-making. Cynics committed to bilingual education viewed the
commissioning of the evaluation as a "stall" on the part of the Scottish political
centre, to avoid fostering the cultural distinctiveness of its highland periphery.
Yet before the evaluation project had reported, the SED purse-strings were
loosed, and substantial sums of earmarked money made available for the
promotion of Gaelic educational programmes; again, cynics might attribute this to
the general wish of a Tory government to earn itself goodwill in Labour
Scotland, rather than to a considered educational decision. The job satisfaction
of evaluators will depend, then, not so much on seeing the "right", rational
decisions taken in the specific context they have studied, but on feeling they have
operated with sufficient rigour and theoretical grounding to advance general
understanding of the overall workings of L2/FL programmes. That is, there must
be a sense in which broad programme evaluations too can claim the status of
"basic" research on the context and dynamics of L2/FL teaching and learning.
References
Atkin, J.M. and E.R. House. 1981. "The federal role in curriculum development, 1950-80." Educa-
tional Evaluation and Policy Analysis 3/5.5-36.
Atkinson, P. and S. Delamont. 1985. "Bread and dreams or bread and circuses? A critique of
'case study' research in education." Controversies in Classroom Research ed. by M. Hammer-
sley, 238-255. Milton Keynes: Open University Press.
Baker, K. A. 1981. Effectiveness of Bilingual Education: A Review of the Literature. Washington,
DC: Department of Education. ED 215 010.
Barrington, G.V. 1982. English as a Second Language. An Evaluation of Calgary Board of Educa-
tion ESL Services Grades 1-12. Summary Report. Calgary, Alberta: Calgary Board of Educa-
tion.
Barrington, G.V. 1986. "Evaluating English as a second language: a naturalistic model." TESL
Canada Journal 3/2.41-51.
Barrow, G.R. 1986. Foreign Language Proficiency in Action. Calumet, IN: Department of Foreign
Languages and Literatures, Purdue University. ED 283 361.
Beretta, A. 1986a. "A case for field-experimentation in program evaluation." Language Learning
36/3.295-309.
Beretta, A. 1986b. "Toward a methodology of ESL program evaluation." TESOL Quarterly
20/1.144-55.
Beretta, A. 1988. "The program evaluator: the ESL researcher without portfolio." Sultan Qaboos
University, Oman. Mimeo.
California State Department of Education. 1985. Handbook for Planning an Effective Foreign Lan-
guage Program. ED 269 993.
Canale, M. et al. 1987. Programme dans les Ecoles Elémentaires de Langue Française pour les
Elèves de Compétence Inégale en Français. Toronto: Ontario Department of Education. ED
281 377.
Cohen, A.D. et al. 1979. Evaluating Evaluation ( = Bilingual Education Series, 6.) Arlington, VA:
Center for Applied Linguistics.
Cronbach, L.J. 1982. Designing Evaluations of Educational and Social Programs. San Francisco:
Jossey-Bass.
Cronbach, L.J. et al. 1980. Toward Reform of Program Evaluation: Aims, Methods and Institutional
Arrangements. San Francisco: Jossey-Bass.
Freed, B.F. 1987. "Preliminary impressions of the effects of a proficiency-based language require-
ment." Foreign Language Annals 20/2.139-46.
Freedman, E.S. 1976. "Experimentation into foreign language teaching methodology." System
4.12-28.
Galton, M., B. Simon, P. Croll, A. Jasmen and J. Willcocks. 1980. Inside the Primary Classroom.
London: Routledge and Kegan Paul.
Gorman, T.P. et al. 1984. Language Performance in Schools: 1982 Primary Survey Report. London:
Department of Education and Science.
Guba, E.G. and Y.S. Lincoln. 1985. Naturalistic Inquiry. Beverly Hills, CA: Sage Publications.
Hagel Jacobson, P.L. 1982. "Using evaluation to improve foreign language education." Modern
Language Journal 66.284-91.
Hebert, Y. et al. 1984. Native Indian Language Education in the Victoria-Saanich Region: An
Evaluation Report. Mimeo. ED 250 341.
Holland, V.M. et al. 1984. English-as-a-Second-Language Programs in Basic Skills Education
Program I. Washington, DC: American Institutes for Research. ED 254 097.
Hornby, P.A. 1980. "Achieving second language fluency through immersion education." Foreign
Language Annals 13/2.107-13.
House, E.R. 1980. Evaluating with Validity. Beverly Hills, CA: Sage Publications.
Indiana State Department of Public Instruction. 1981. Designing Strengthening and Assessing
School FL Programs. ED 222 040.
Lapkin, S., M. Swain, J. Kamin and G. Hanna. 1983. "Late immersion in perspective: the Peel
study." Canadian Modern Language Review 39/2.182-206.
Lee, K.B. 1982. Evaluation of Foreign Language Program in Urban Community. Mimeo. ED 226
588.
Long, M.H. 1980. "Inside the 'black box': methodological issues in classroom research on lan-
guage learning." Language Learning 30/1.1-42.
Long, M.H. 1984. "Process and product in ESL program evaluation." TESOL Quarterly 18/3.409-
25.
MacDonald, B. and R. Walker, eds. 1974. SAFARI I: Innovation, Evaluation, Research and the
Problem of Control. CARE, University of East Anglia.
MacDonald, B. et al. 1982. Bread and Dreams. ( = CARE Occasional Publications, 12.) CARE,
University of East Anglia.
McGinley, K. 1986. "Coming to terms with evaluation." System 14/3.335-41.
Mackay, R. 1981. "Accountability in ESP Programs." ESP Journal 1/2.107-22.
Mitchell, R. et al. 1987. Report of an Independent Evaluation of the Western Isles' Bilingual Educa-
tion Project. Department of Education, University of Stirling.
Mitchell, R. forthcoming. "Evaluating bilingual primary education." Evaluating Language Educa-
tion Programs ed. by A. Beretta and J.C. Alderson. Cambridge: Cambridge University Press.
Morrison, F. 1984. Speaking French in Five-year-old Kindergarten. Ottawa: Ottawa Board of Edu-
cation. ED 259 591.
Ohio State Department of Education. 1981. A Self-Appraisal Checklist for FLs in Ohio's Secondary
Schools. ED 206 180.
Oklahoma State Department of Education. 1981. Curriculum Review Handbook: Foreign Lan-
guage. ED 205 051.
Otto, F. 1969. "The teacher in the Pennsylvania Project." Modern Language Journal 53/6.411-420.
Parkinson, B. et al. 1982. An Independent Evaluation of 'Tour de France' ( = Stirling Educational
Monographs, 11.) Department of Education, University of Stirling.
Prabhu, N.S. 1987. Second Language Pedagogy. Oxford: Oxford University Press.
Richards, J.C. 1984. "The secret life of methods." TESOL Quarterly 18/1.7-23.
Scherer, A.C. and M. Wertheimer. 1964. A Psycholinguistic Experiment in Foreign Language
Teaching. New York: McGraw-Hill.
Simons, H. 1987. Getting to Know Schools in a Democracy. Lewes, E. Sussex / Philadelphia:
Falmer.
Smith, P.D. Jr. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign Lan-
guage Instruction: the Pennsylvania Foreign Language Project. Philadelphia: Center for Cur-
riculum Development.
Stake, R.E. 1967a. "Toward a technology for the evaluation of educational programs." Perspectives
on Curriculum Evaluation ( = AERA Monograph Series on Curriculum Evaluation, 1) ed. by
R.W. Tyler, R.M. Gagne and M. Scriven, 1-12. Chicago: Rand McNally.
Stake, R.E. 1967b. "The countenance of educational evaluation." Teachers College Record
68/7.523-40.
Stufflebeam, D.L., R.L. Hammond, H.O. Merriman, M.M. Provus, W.J. Foley, W.J. Gephart and
E.G. Guba. 1971. Educational Evaluation and Decision Making. Itasca, IL: Peacock.
Swain, M. 1984. "A review of immersion education in Canada: research and evaluation studies."
Language Issues and Educational Policies ( = ELT Documents, 119) ed. by P. Allen and M.
Swain, 35-51. Oxford: Pergamon/British Council.
Swain, M. and S. Lapkin. 1982. Evaluating Bilingual Education. Clevedon, Avon: Multilingual
Matters.
Ullmann, R. and E. Geva. 1985. "Expanding our evaluation perspective: what can classroom ob-
servation tell us about core French programs?" Canadian Modern Language Review
42/2.307-23.
Ullmann, R. et al. 1983. The York Region Core French Evaluation Project. Toronto: Ontario In-
stitute for Studies in Education.
Valette, R.M. 1969. "The Pennsylvania Project, its conclusions and its implications." Modern Lan-
guage Journal 53/6.396-404.
Wagner, J.J. and G. Tilney. 1983. "The effect of 'superlearning techniques' on the vocabulary ac-
quisition and alpha brainwave production of language learners." TESOL Quarterly 17/1.5-19.
Waters, A. 1987. "Participatory course evaluation in ESP." English for Specific Purposes 6/1.3-12.
White, R.V. 1988. The ELT Curriculum. Oxford: Basil Blackwell.
The Characterization of Teaching and Learning
Environments:
Problems and Perspectives
Dick Allwright
The main purpose of this paper is to argue that the characterization of
teaching and learning environments is something that must still come from
research, rather than something that we are ready to impose upon it. After a
quarter of a century or more of work developing observation systems that purport to
offer valid a priori ways of categorizing behaviour in and out of classrooms (for a
particularly highly developed system see FOCUS, Fanselow 1977; for a
historical overview see Allwright 1988), we still do not know enough, I suggest, to
be able to specify adequately for research purposes the criterial attributes of
learning environments.
It may even be more appropriate in any case to put forward the argument
that in principle we could never "know enough", because learners' own ways of
perceiving and construing the environments they are in are quite possibly much
more powerful than any externally observable characteristics the learning
environments may present to researchers (see Breen 1985 and forthcoming; see
Allwright 1987). If such is the case then it would follow that what we need to study,
as a major priority, is the characterization process that learners themselves are
engaged in.
To develop fully the case against the observation systems so far devised
would require a lengthy survey that would be beyond the scope of this paper,
and it would duplicate discussions already in print (see especially Allwright
1987). It may be more useful here to focus instead on the grosser
characterizations of learning environments that the field has become accustomed to
using in recent years. The first part of this paper will therefore reconsider the
two most common "traditional" distinctions: firstly that between "informal" and
"formal" contexts, and then that between "second language" and "foreign
language" contexts.
When we have, to take up the subtitle of this paper, reviewed the
"problems" inherent in basing research on these gross but familiar and pervasive
distinctions, we will then move on to consider the "perspectives" offered by an
alternative view of learning environments: one that focusses on the nature of the
"learning opportunities" that arise in different contexts. This view will be
illustrated from recent doctoral research at Lancaster which itself reinforces the
suggestion made above that our understanding will be very limited if we do not find
ways of investigating learners' own, probably highly idiosyncratic, processes of
characterizing the learning environments they find themselves in.
First, however, we need to present very briefly the "traditional" distinctions
in our field, consider the purposes that characterizations might be intended to
serve, and, in the light of the obvious interest in using characterizations to
investigate the possible "causes" of learning outcomes, review the different types of
learning outcome we need to bear in mind.
1 The traditional distinctions in our field
As already suggested, it is now traditional to distinguish between teaching
and learning environments in two major ways: first by distinguishing
between "informal" and "formal" contexts, and secondly by distinguishing between
"second language" and "foreign language" contexts. These are commonsense
dichotomies that have a certain practical validity for the language teaching
profession. They are both problematic from a research standpoint, however, and their
very problematicity may be useful to us as a starting point here, because thinking
about their inadequacies may help us to arrive at more useful ways of
characterizing teaching and learning environments for our own research purposes.
1.1 Characterizations and purposes
Before we begin to consider the two dichotomies already introduced,
however, it may be helpful to stress the point only alluded to above that
characterizations have no particular value in themselves. To be of any value at all they
must be relevant to people's purposes, and then they can be judged according to
how well they serve those purposes. Our starting point here must be that we are
trying to serve the purposes of people who are researching into foreign or
second language learning, but that in itself does not delimit the field adequately,
given the great variety of possible research interests in this area.
Three broad research areas can be discerned. Firstly there is the interest in
theory-building. Secondly there is the interest in developing what might be
called (perhaps unkindly) "piecemeal understanding" or (more positively)
"insight", rather than any formal theory. And thirdly there is the quite different
interest involved in providing decision-makers with the descriptive information
they may need in a given pedagogic situation. These three are conceptually
distinct, but may well come together in practice.
For present purposes I will assume agreement that theory construction
(whatever we take the word "theory" to mean) is at the heart of the research
enterprise, and that research is essentially about developing our understanding of
whatever phenomena interest us. I also assume agreement that "understanding",
for our purposes, is a matter of becoming less uncertain of the factors that we
can reasonably hold to determine outcomes. (This is clearly a viewpoint that
perpetuates the concern of western science for causes, and as such it is certainly
challengeable, but it probably represents the view of the majority of researchers
in our area.) The phenomena that interest us, presumably, are the processes
whereby a speaker of at least one language becomes a speaker of another
language, or at least moves towards that state (we might also be interested in the
processes whereby a learner attempts, albeit unsuccessfully, to move in that
direction, and in the processes whereby a teacher might try to motivate reluctant
learners, but these are more likely to be considered peripheral concerns). More
particularly, if we are also educators, we will want to know the extent to which
the result of those processes depends on contextual (and therefore potentially
manipulable) factors. And of course this entails being able to discriminate
between those contextual factors (the characteristics of environments) that make a
difference and those that are purely incidental. This can bring us neatly back to
the two commonsense dichotomies we started with. Do they capture factors that
make a difference?
1.2 Different types of "difference"
Unfortunately we cannot usefully consider even that last question without
spending at least some time on the prior question of the sorts of differences we
are interested in. Again I will have to take the reader's agreement for granted
that the following list is relatively unproblematic, since there is not space
enough here to lay out the arguments fully. Five types of difference can be
readily discerned: rate, ceiling, course, process, and affect.
Rate is of obvious practical significance to educators looking for
effectiveness. All other things being equal, rapid progress is bound to be preferred to
slow progress. Even those who are attempting to construct theories to account
for language learning in the most general sense are caught up in this concern for
pace to the extent that their theories have eventually to account for the obvious
fact that learners do differ very considerably in the speed with which they make
progress.
Ceiling is a less obvious category, perhaps, but one of considerable practical
significance if the claims are justified of those who argue that certain types of
learning environment predict difficulty for learners in going beyond a certain
stage in linguistic development (see Higgs and Clifford 1982). Under the
heading of "fossilization", the topic has also been a major concern of second
language acquisition research from its beginnings as a separate research enterprise
(see Selinker 1972).
Course is of immediate theoretical interest, since it concerns the extent to
which progress in language may be universal rather than idiosyncratic, but it is
also of major practical significance because of the implications of the possibility
of universality. If linguistic progress follows a universal course then it must be
independent of context in this respect, and independent, therefore, of the
environmental differences we are seeking to characterize in this paper. We have to
imagine a world in which pedagogy may be able to affect the rate of progress,
possibly also its ceiling, but not its day by day course. This calls into question the
role of the syllabus, of course, given that traditionally the syllabus has been seen
as a way not only of controlling the order in which items are taught, but, ipso
facto, the order in which they are learned.
Process is a less familiar category here, no doubt, but it deserves a place for
itself, I believe, for similar reasons to those adduced for "course", namely that if
there is reason to believe that learners' mental activity is also somehow
context-independent then this entails that the way in which learners are taught does not
determine the way in which they learn. This would have obvious practical
implications if we had to accept that teaching methods are in this sense powerless
(see also Allwright 1984a).
Affect is a different sort of difference from the other four, clearly, but an
extremely important one from an educational point of view. If we take seriously
the argument that languages are on school curricula because of their potential
value in fostering understanding and communication between peoples (a
common enough rationale, surely) then we must be interested in knowing whether
the experience of being an institutional learner of a particular language results
in primarily positive or primarily negative attitudes. We know that the results
are not by any means always positive in this respect. Oller et al., for example, in
their 1977 study of Mexican workers in the USA, found them becoming less
rather than more positive about the new country, as their linguistic proficiency
developed.
Having reviewed five types of possible difference in learning outcomes we
can now, finally, return to the two dichotomies outlined in the introduction to
this paper and begin to discuss their problematic aspects.
1.3 Informal versus formal contexts
This commonsense dichotomy attempts essentially to capture the obvious
(but ultimately problematic) point that some learners have teachers and others
do not. The practical validity of the point hardly needs to be argued. Quite
clearly some people are employed professionally as language teachers, and not
everybody who adds a language to their repertoire has access to such a person. This is
often taken, however, as being synonymous with the proposition that some
people are taught while others are not. This would only be true if we were
prepared to define "teaching" as the exclusive preserve of people officially
recognised as "teachers". That, I would argue, would be a wholly unsatisfactory
definition of "teaching" for research purposes.
1.4 Defining "teaching"
If, on the other hand, we define "teaching", for research purposes at least, as
a matter of "providing learning opportunities" (see Allwright 1986), then we can
immediately see, I suggest, that people paid professionally to be "teachers" are
not the only possible people to do "teaching". Many people, regardless of their
professional designation, may be in a position to provide learning opportunities,
whether deliberately or purely incidentally as a by-product of some other
activity. Such people would of course include other learners in a classroom situation.
This possibility would make it important that any research aimed at measuring
the impact of "teaching" should take into account the extent to which the
"teaching", as now defined, is not purely and simply in the hands of the
ostensible teacher.
With only a small extension of the above thinking we can also see that
learners may be teachers for themselves, individually, in the sense and to the
extent that they create learning opportunities for themselves (and thereby for
each other, of course, in a class situation). This too would need to be taken into
account in any research on classroom language learning.
Is it reasonable, however, to define teaching in this way, simply as the
provision of learning opportunities? Certainly it could be offensive to paid
professionals to see their painstaking pedagogical work apparently undifferentiated
from a learner's chance encounter with a garrulous native speaker. It might well
seem more reasonable to build in a distinction based on the intentionality of
professional teaching, but a moment's thought should suffice to persuade us that
we cannot take for granted that this intentionality actually makes a difference,
any more than we would wish to take for granted that a learner's intentions are
necessarily operative in determining learning, given that so much learning
appears to take place (in all spheres of human lives) out of awareness. Any a priori
differentiation gives the teacher's work a privileged status it may not deserve. It
takes the value of the professional's expertise for granted, of course, and it is
virtually axiomatic for research in our field that we do well whenever we challenge
whatever has typically been taken for granted within the profession. There is
also research evidence, albeit so far not nearly conclusive, in support of the
conception that learners may learn from "opportunities" provided by fellow
learners rather than from teaching traditionally conceived. In recent research at
Lancaster University, for example, Assia Slimani found that learners were more
likely to claim to have learned items made topical by a fellow learner than items
made topical by their teacher (1987: 267-274; 1989a: 84; 1989b: 226-229). She
also found, interestingly, that items made topical by a fellow learner were more
powerful, by the same measure, than items made topical by oneself (1987:
173-180; 1989: 228).
What all the above suggests is that any a priori characterization of learning
and teaching environments is bound to be suspect, for fundamental research
purposes. An educational decision-maker in a privately-owned language school
may well call for research that compares, for example, learning in classroom
groups and learning in self-access facilities, but whatever differences emerge will
not be properly understood (except by chance) if it is simply assumed that the
operative difference between the two environments is the designated
difference in administrative patterns (virtually the presence or absence of a class
teacher to determine learner behaviour). We may be reminded at this point of
Long's 1983 survey paper that concludes generally in favour of the proposition
that language instruction makes a positive difference, that it is generally better
than no instruction at all. And yet "instruction" remains undefined, as a given.
The conclusion itself, therefore, remains uninterpretable to the extent that we
still need, as Long himself notes (1983: 380), to know about the potential value
of different types of instruction, and that involves knowing about the relevant
ways in which types of instruction differ from one another. And that brings us to
the position that the characterization of teaching and learning environments is
something that must emerge from research, rather than something that can be
imposed upon research as a framework of independent value. We need the
research precisely for the purpose of telling us how usefully to characterize
teaching and learning environments.
With the foregoing admonition in mind we should perhaps turn now to our
second commonsense dichotomy — that between "second language" and "foreign
language" contexts.
1.5 "Second language" and "foreign language" contexts
This distinction is commonly taken to refer to the issue of whether language
teaching and learning take place in a setting where the target language is also
the language of local society, or in a setting where the target language is not a
language generally in use outside the classroom. Such a distinction is of obvious
practical importance to teachers, if, as seems apparent, they need to adjust what
they do inside the classroom according to the possibilities that exist for language
contact outside the classroom. The distinction will not work as a dichotomy, of
course, because of the ease with which it is possible to find situations that do not
conform neatly to either of the two specifications. More obviously important
here, however, is the issue of determining whether or not such differences in
setting actually "make a difference" to any of the five types of learning outcomes
described above. But the phrase "such differences" begs all the questions. If
"second language setting" and "foreign language setting" are problematic
notions in themselves then there is no value in trying to use them as if they were
straightforward. Once again, we need research to tell us what the relevant
differences are. And, once again, we can use the notion "learning opportunity" to
help us discuss the problematicity of "second language" and "foreign language"
settings. Essentially what we are talking about, it seems, when considering these
different settings, is a matter of access to learning opportunities.
2 Learning opportunities
First consideration of classroom data suggests that learning opportunities
may be broadly described in two main ways: as "encounter" opportunities and as
"practice" opportunities. "Encounter" opportunities, as their name suggests, are
opportunities to meet whatever is to be learned, while "practice" opportunities,
naturally enough, are opportunities to do something with target material. Two
major comments are necessary at this stage, however. Firstly, the lack of
reference to language in the above characterizations is deliberate, given that the
analysis is intended to apply regardless of subject matter. Secondly, and more
importantly for our purposes here, it is probably not helpful to think in terms of
different types of opportunity. It may be more helpful to think of "encounter"
and "practice" as two ways of looking at any one opportunity. It may well be the
case that they more often occur in combination with each other rather than
isolated from one another. This perspective also allows us to include affect as a
further aspect of learning opportunities, that is, to comment on the way in which
opportunities might be conducive either to enhanced receptiveness or to
enhanced defensiveness, for example.
2.1 "Encounter"
A more familiar term to introduce discussion in this area would
undoubtedly be "input", but I wish here to develop the notion of encounter considerably
beyond the standard treatments of the topic of input, hence the different
terminology. What I have in mind is spelled out in a preliminary way in an earlier
paper (Allwright 1984b) in which I differentiate basically between input in the
form of target language material itself, and input that takes the form of guidance
about the target language. This treats as unproblematic the very real issue of
what we mean by the "target language", of course: should we include here
coverage of all the aspects of communicative competence as outlined by Canale
and Swain (1980), for example, and should we also try to cover the literary aims
of many language courses around the world? It also ignores the problem that the
language, however widely conceived, is not necessarily the only target in any
case, given the recent growth of interest in using language classrooms as places
in which to try to help learners also learn something useful about learning itself,
so that they can become more effective language learners (see Wenden and
Rubin 1987).
2.2 "Practice"
"Practice" is now a difficult term to use, because of its associations with
behaviourist approaches to language instruction, but we need some such term to
refer to the mental operations a learner may perform on encountering target
material, and in doing whatever it takes to learn it. Hearing a teacher explain, in
the target language, a particular linguistic concept, offers opportunities to
encounter the explanation and also to encounter the language in which it is
expressed. Beyond that, of course, it also offers opportunities to practice the
mental operations involved in listening comprehension, whatever we take them to
be, and that in turn may itself constitute an act of learning, if we can accept the
view that comprehending is virtually synonymous with acquiring (see Krashen
1985: 4).
3 Learning opportunities and the SLA mainstream
The foregoing discussion of learning opportunities will perhaps have
prompted the reader to think about recent SLA work on the role of
conversational adjustments in classroom language learning. On the face of it, making
conversational adjustments would appear to be a matter of doing things with
learning material, and therefore a matter of "practice opportunities" as outlined
above. The main thrust of the published studies in the area of conversational
adjustments themselves, however, is to focus on conversational adjustments as a
way of refining encounter opportunities so that the language material
encountered becomes comprehensible to the non-native speaker (see, for
example, Doughty and Pica 1986). This is in conformity with a line of argument
that starts with the proposition that comprehensibility is the key, argues from
this that a reasonable measure of comprehensibility will be the number of
adjustments made by a speaker in interaction with a learner, and then researches
how different tasks affect the number of adjustments made, on the assumption
that tasks that generate a greater number of such adjustments are likely to be
more valuable to learners than tasks which are typically less productive in this
respect. Aston (1986) has drawn attention to some of the problems with this line
of reasoning. What I want to do here is simply to express my own concern that it
neglects the practice aspect of the process of making conversational
adjustments. Learners, not just teachers, do interactive work to make the language
material they encounter comprehensible. We need to know whether this work
can be expected to be of value in itself, as mental activity that contributes
directly to learning in some way, or whether it is only of value in that it clarifies, if
indeed it does, the item about which the interactive work is done. In other words,
if a learner asks a question about something the teacher (or anybody else) has
said, should we expect the asking of the question and the subsequent
interchange to be the productive act for that particular learner, or should we expect
the value to lie in the product, the clarified item, rather than in the process?
Assia Slimani's work, as referred to above, suggests that the position is even
more complex in practice, in that it appears that interactive work by learners is
more likely to be helpful to those in a position to overhear it than to those who
actually do the work, whether we look at it in process or in product terms. That
is to say, it appears that the more proficient learners, by doing interactive work,
put on a show which enables some less proficient learners to pick up some
language items, but which does not correlate with progress for the more proficient
learners themselves. There is some evidence that the more proficient learners
are not as disadvantaged as the less proficient (within any one classroom group)
in terms of their ability to profit directly, in product terms, from their own
interactive work (Slimani 1987: 270-271, 1989a: 83).
All these suggestive findings (and they are no more at present) lead me to
propose that research should pay attention to both the "encounter" and the
"practice" aspects of learning opportunities. Beyond that, they also suggest that
research should pay particular attention to learners' proficiency levels relative to
those of others in the same learning group. Teaching and learning environments,
we might say at this point, differ interestingly in terms of the characteristics of
the learning opportunities they provide, and in terms of the proficiency
relationships they offer learners. We know very little indeed about the impact of
proficiency relationships on any of our five learning outcomes (although Safya
Cherchalli's 1988 Lancaster doctoral thesis throws new light on the issue by
means of a diary and interview study as follow-up to a questionnaire survey). We
know more about what are probably the relevant characteristics of learning
opportunities, even though we have generally discussed the issues in different
terms. For example, we can be reasonably sure that research will need to pay
attention to the source of learning opportunities. The notion of source, however, is
itself a complex one in relation to learning opportunities, because wherever
these are social, interactive events (as they typically are in classroom language
lessons), we are likely to find it very difficult to talk about a single source. For
example, a learner may initiate an enquiry which the teacher responds to. From
one point of view the learner is the source (the originator), but from another
point of view the teacher is clearly the source of the relevant learning material,
no matter who or what prompted its inclusion in the discourse. To complicate
matters even further, we have some reason to believe that the question of who is
the addressee of learning opportunities may also be a relevant factor. For
example, as we saw indicated in Slimani's work (op.cit.), learners seem to find it
less difficult to take something from learning opportunities which are not
addressed to themselves in particular but are addressed either to someone else in
the class, or to the whole class. They are nevertheless likely to say, in a
questionnaire response, that they actually prefer to be the direct addressee (see Lahcen,
in progress). It seems most likely that this phenomenon is closely related to the
issue of relative proficiency, to which we will return below.
4 Idiosyncrasy and systematicity
A complication of a different nature, and one with very far-reaching
implications for us here, comes from the persuasive suggestion that "each lesson is a
different lesson for each learner". This proposition is certainly supported by the
evidence that what learners get from lessons is highly idiosyncratic (see Slimani
1987: 290-294; 1989a: 84-85). It implies that the external and observable aspects
of learning opportunities are not themselves determinant but have their
influence only in interaction with the way in which they are construed by the
learners individually. This in turn implies that what learners do essentially amounts
to a characterization, in very much their own terms, of the learning
opportunities, and hence the environments, that they find themselves in. And that in
turn implies that we cannot hope to get very far with our understanding of
classroom language learning unless we include in our investigations ways of
discovering how, and in what terms, learners go about this characterization process.
There might be little point in such an enterprise if learner behaviour was
essentially random in nature, of course, but "idiosyncratic" behaviour is not
necessarily "random" behaviour at all. It may well be rule-governed in some important
way, but any learner's choice of which rule or rules to apply in characterizing a
given learning opportunity would be guided by internal as well as by external
factors. These "internal" factors would presumably include both such relatively
stable matters as the learner's personality and also such essentially dynamic
matters as the learner's current mental state (both necessarily idiosyncratic in
nature). Of particular relevance, it seems, in terms of the learner's current mental
state, might be the learner's perception of the amount of processing capacity
available to him or her to perform the tasks that a given learning opportunity
might make in principle possible. This rather abstruse line of reasoning can
perhaps best be illustrated from Safya Cherchalli's diary and interview data by the
relatively frequent comments from not very successful learners (Algerian senior
secondary students) to the effect that "We try to understand the words, not the
lessons" (1988: 153). This formulation can be most plausibly interpreted, I
believe, as evidence that such learners reclassify lessons as opportunities to learn
isolated words, no matter what the focus is from the teacher's point of view,
because they believe that their processing capacity will not enable them to be
successful on any more complex task than a straightforward one of lexical
memorisation. Whether such beliefs about their processing capacity are
"objectively" justified is not the issue, of course, since what counts is only the
individual learner's perception of the state of affairs.
The preceding illustration has opened up the possibility that what appears
to be essentially idiosyncratic behaviour may not only be rule-governed but also
systematically related to identifiable learner characteristics. In this particular
case it would appear that relative proficiency may be the key since, as we have
already noted, it is typically those learners who are relatively less proficient, in a
given classroom group, who tell us that they feel limited to trying to catch the
words, rather than to make sense of the "lesson" as a whole. It does seem only
very remotely possible, however, though clearly well worth investigating, that we
would eventually be able to account for the bulk of learners' idiosyncratic
behaviour in terms of such generalisations.
5 Conclusions
This paper has been an attempt to throw light on some of the issues
involved in characterising learning and teaching environments. The major points
to have emerged, I believe, are the following.
Firstly, that useful characterizations are necessarily to be seen primarily as
the product of research, rather than as a priori inputs to it.
Secondly, that we cannot yet say with any confidence what the criterial
attributes of learning and teaching environments are, and therefore cannot yet
characterize them in a way known to be systematically related to learning
outcomes, since our research has not advanced that far. In this connection I have
illustrated the point by developing an alternative definition of "teaching", for
research purposes, and explored some of its potential implications via a
"learning opportunities" approach to the analysis of classroom language learning. In
this way I may perhaps appear to be trying to provide the world with yet another
characterization scheme, perhaps eventually to be seen as a rival to Fanselow's
FOCUS (1977), to Ullmann and Geva's TALOS (1983), or to Allen et al.'s
COLT (1984). In self-defence I can only argue that I am well aware of the
dangers of such an enterprise (these are well set out in Chaudron 1988: 21-22), and
am offering my own approach rather as a complement to those current SLA
studies, in the hope that such a multiplicity of viewpoints will serve not as a set
of increasingly constricting straightjackets but as an encouragement to the
broadening of research attempts to develop our understanding of the
complexities of classroom language learning.
One further point remains to be made. The foregoing analysis in terms of
learning opportunities has included the introduction of the potentially highly
productive observation that the characterization of learning environments is not
something done only by researchers. This characterization process is rather part
of the normal business of being a learner. As such it may also be crucial to the
process whereby learners get whatever they do get from being in language
lessons. The natural corollary is that what we need to study, in our research, is the
characterization process itself, among our learners.
We conclude, then, with the proposition that the characterization of
learning and teaching environments is far from being merely a preliminary to
research. It is both an important outcome of research and an important object of
research in its own right, as a process vital to our learners' classroom lives.
References
Allen, J.P.B., M. Fröhlich and N. Spada. 1984. "The Communicative Orientation of Language
Teaching: An Observation Scheme." Handscombe, Orem and Taylor 1984.231-252.
Allwright, D. 1987. "Classroom Observation: Problems and Possibilities." Das 1987.88-102.
Allwright, D. 1988. Observation in the Language Classroom. London: Longman.
Allwright, R.L. 1984a. "The Importance of Interaction in Classroom Language Learning." Applied
Allwright, R.L. 1984b. "Why Don't Learners Learn what Teachers Teach? The Interaction
Hypothesis." Singleton and Little 1984.3-18.
Allwright, R.L. 1986. "Making Sense of Instruction: What's the Problem?" Papers in Applied Lin-
guistics — Michigan 1/2.1-11.
Aston, G. 1986. "Trouble-Shooting in Interaction with Learners: The More the Merrier?" Applied
Breen, M.P. 1985. "The Social Context for Language Learning: A Neglected Situation?" Studies
in Second Language Acquisition 7.135-158.
Breen, M.P. (forthcoming). Understanding the Language Teacher.
Canale, M. and M. Swain. 1980. "Theoretical Bases of Communicative Approaches to Language
Teaching and Testing." Applied Linguistics 1.1-47.
Chaudron, C. 1988. Second Language Classrooms: Research on Teaching and Learning.
Cambridge: Cambridge University Press.
Cherchalli, S. 1988. Learners' Reactions to their Textbook (with special Reference to the Relation
between Differential Perceptions and Differential Achievement): A Case Study of Algerian
Secondary School Learners. Lancaster: Doctoral Thesis.
Das, B.K., ed. 1987. Patterns of Classroom Interaction in Southeast Asia (= Anthology Series, 17.)
Singapore: SEAMEO Regional Language Centre.
Doughty, C. and T. Pica. 1986. "Information Gap Tasks: Do they Facilitate Second Language
Acquisition?" TESOL Quarterly 20/2.305-325.
Fanselow, J.F. 1977. "Beyond Rashomon: Conceptualizing and Describing the Teaching Act."
TESOL Quarterly 11/1.17-40.
Handscombe, J., R. Orem and B. Taylor, eds. 1984. ON TESOL '83: The Question of Control.
Washington, DC: TESOL.
Higgs, T.V., ed. 1982. Curriculum, Competence, and the Foreign Language Teacher. Skokie, Illinois:
National Textbook Association.
Higgs, T.V. and R. Clifford. 1982. "The Push Toward Communication." Higgs 1982.57-79.
Krashen, S.D. 1985. The Input Hypothesis: Issues and Implications. London/New York: Longman.
Lahcen, D.B. (In progress.) Attention in Classroom Language Learning. Doctoral Research at the
University of Lancaster.
Long, M.H. 1983. "Does Second Language Instruction Make a Difference? A Review of
Research." TESOL Quarterly 17/3.359-382.
Meara, P., ed. 1989. Beyond Words ( = British Studies in Applied Linguistics, 4.) London: British
Association for Applied Linguistics.
Oller, J.W. Jr., L.L. Baca and A. Vigil. 1977. "Attitudes and Attained Proficiency in ESL: A
Sociolinguistic Study of Mexican Americans in the Southwest." TESOL Quarterly 11/2.173-183.
Selinker, L. 1972. "Interlanguage." International Review of Applied Linguistics in Language
Teaching 10/3.209-231.
Singleton, D.M. and D.G. Little, eds. 1984. Language Learning in Formal and Informal Contexts.
Dublin: Irish Association for Applied Linguistics (IRAAL).
Slimani, A. 1987. The Teaching/Learning Relationship: Learning Opportunities and Learning
Outcomes. An Algerian Case Study. Lancaster: Doctoral Thesis.
Slimani, A. 1989a. "Learning Words from Classroom Discourse." Meara 1989.79-87.
Slimani, A. 1989b. "The Role of Topicalization in Classroom Language Learning." System
17/2.223-234.
Ullmann, R. and E. Geva. 1983. Classroom Observation in the L2 Setting: A Dimension of Program
Evaluation. Ontario: Modern Language Centre, Ontario Institute for Studies in Education.
Wenden, A. and J. Rubin. 1987. Learner Strategies in Language Learning. Englewood Cliffs, NJ:
Prentice/Hall International.
Section IV — Learning Environments
Introduction to the Section Learning Environments
Claire Kramsch
The post-structuralist revolution in the language sciences has given ever
more importance to the notion of context and variability in language acquisition
and use. Foreign language research echoes in this respect the general trend in
language pedagogy both in Europe and in the United States. By shifting its
attention from the structures of language to language learning processes and,
hence, to the person of the learner, research follows the same trend as language
pedagogy, broadening its base from language forms to language use, from the
individual learner to his/her interaction with the environment. The notion of
"environment," a term that originated in ecology and has now returned to education
after a loop via the computer sciences, is broader than that of context or
situation. It evokes global worlds of interconnected networks, "coral gardens" with
their delicate balance of cultures.
Learning environments are defined as either topographically different
settings (e.g. instructional or natural environments, computer microworlds) or
different discourse genres in each of these settings (e.g. dialogue, monologue, oral
or written narrative), or different discourse forms within each genre (e.g.
instructional, communicative, procedural, phatic) or different linguistic contexts of
occurrence. They can refer to persons (teachers, peer tutors), materials
(textbooks, speech, knowledge in various forms) or circumstances (fortuitous or
deliberate). They are always interactional, in that they elicit or facilitate learning
through interaction with the learner.
The question asked by researchers in this last section is: What kinds of
learning environments facilitate the acquisition of foreign languages? The first two
chapters give a response to this question from the two opposite ends of the
spectrum: the (natural) mind of the individual learner and the (electronic) mind of
the computer. Recent advances in linguistic theory allow us to speculate about
adult learners' developmental stages in the acquisition of syntactic structures.
Suzanne Flynn (chapter 12) gives a summary of recent thought in Universal
Grammar theory that accounts for the ability of adults to learn a second
language. If the principles of Universal Grammar are still available to them, all they
have to do is reset the parametric switches of UG principles to fit the L2
parameters of the new linguistic environment. At the other end of the cognitive
spectrum, we have the availability of this super-learner/teacher: the computer.
General theories of learning have shed some light on the psychological
aspects of the acquisition of language: cognitive processes, interactional events,
relationship of language to thought. Thus some progress has been made in our
understanding of learners' cognitive interaction with their environment. The
paradigm shift mentioned earlier, from linear, product-oriented, structural forms of
knowledge to relational, process-centered, procedural ones, raises some
interesting epistemological issues as to how knowledge is represented, transmitted
and internalized. Ralph Ginsberg describes in chapter 13 some of the advances
made in the design of intelligent tutoring systems. These electronic learning
environments challenge the basic tenets of traditional language pedagogy.
Between these two extremes, we have nurseries, streets and classrooms. In
chapter 14, Edmondson explores the cognitive and social dimensions of a
learner's interaction with teacher and peers in classroom settings and he cautions
against a simplistic, reductionist view of classroom interaction as a mere
sequence of turns-at-talk, even if these surface phenomena are the easiest to
research.
In the same sense that the term "teaching environment" included, but was
not limited to, the person of a teacher, the notion of learning environment
implies that, although learning cannot take place without mediation, this mediation
can be either direct or indirect. Moreover, since the learner is part of the
environment, we have to think of the relationship between the two as a process of
mutual creation: a learning environment is by definition a context that is not
only conducive to change, but susceptible to change as well through its
interaction with the learner. Thus the notion of environment has to be seen as a
flexible, variable concept, in which learner and the conditions of his/her learning
define each other mutually in a cybernetic sense. This is most true of the
learning of culture through language, as Kramsch shows in chapter 15. The
development of cross-cultural competence requires an ecological understanding of
social and cultural environments that can only emerge from the contrastive
perspectives of both the source and the target cultures. The best learning
environment seems to be the one which allows itself to be deconstructed for precisely
what it is: an environment that allows the learner to eventually dispense with it and
become more than the sum of its parts.
Some Ins and Outs of Foreign Language Classroom
Research
Willis J. Edmondson
1 The Classroom as a Complex Learning Environment
In classroom-based research, we are motivated by both theoretical and
practical concerns. On the one hand, we seek to discover how participants in
language classrooms actually learn, in order to develop more adequate theories
of classroom learning; on the other hand, we are seeking to establish which
classroom events are conducive to more effective learning, such that our
research findings may be of direct pedagogic relevance. These theoretical and
pedagogic goals should be mutually reinforcing. Teaching and learning in
classrooms is both the source and the testing-ground of language learning theory
and hypothesis (cf. Long 1985, and the distinction between top-down and
bottom-up research strategies).
However, while both goals should interact in classroom-based research, the
goal of understanding learning itself is process-oriented, while the goal of
establishing conducive classroom conditions for such learning is more
product-oriented. There is then firstly the problem of postulating learning processing on
the basis of observation, and secondly the problem of operationalising a theory
of learning in terms of hypotheses regarding effective teaching behaviours. Both
goals are further beset by problems stemming from the complexity of factors
impinging on the classroom as a learning environment, in particular the interaction
between learner-internal variables and learner-external variables.
In investigating some of this complexity, I shall adopt an "interactionist"
position, in following the hypothesis that the quality of the interaction that takes
place in classrooms essentially determines what is learnt there (e.g. Hatch 1978:
403; Allwright 1984). The argument in favour of focussing on the classroom as
interaction is in fact surely self-evident: as Allwright remarks "Interaction is the
process whereby lessons are 'accomplished'..." (Allwright 1984: 159). In other
words, teaching is interaction, and classroom learning occurs in and through
interaction. A focus on interaction in the classroom has developed logically in
Second Language Acquisition (SLA) studies focussing on the validity of
Krashen's Input Hypothesis (Krashen 1982). Three reasons justify this
development of focus. Firstly, learner outputs also act as input to that learner's own
processing mechanisms (a point made by Sharwood-Smith 1981). Secondly, learner
outputs also act as input for other learners in the same environment, and thirdly
what learners say may clearly determine what will happen next: in the simplest
case, what form further input from a teaching source will take. So, following
such arguments, one can only adequately analyse "input" as a product of
classroom interaction.
The aim of the paper is to expand on this claim, and consider selected
theoretical concepts and research procedures that may be useful in moving closer to
this goal. While I shall be centrally concerned with foreign as opposed to second
language learning (a blurred distinction, but still a useful one), I shall also be
concerned with some central features of SLA research, assuming that results
established under this rubric should be relevant to foreign language teaching and
learning. Even if this assumption turns out to be optimistic, we may still assume
that an ultimate goal must be a theory of classroom learning that applies to both
types of learning context. The "ins" and "outs" of my title concern then both
learner inputs and learner outputs in classroom interaction, and the problem of
relating learner-internal variables to learner-external factors in seeking to
understand classroom learning. The paper is in four sections. Part 1 briefly
contrasts a concern with external factors and a concern with internal factors. The
second part of the paper raises the vexed "nature versus nurture" issue with regard
to internal learner characteristics. The third and major part of the paper focusses
on classroom interaction. A brief summary is offered in conclusion.
2 "Internal" versus "External" factors in Classroom Learning
The complex of factors impinging on concrete learning-teaching events
has been represented in the form of models by, for example, Dunkin and Biddle
(1974), Strevens (1977: 12-36), Stern (1983: 43-50), Edmondson (1984).
Assuming it is useful to distinguish between "external" and "internal" factors, as the
dual goals of classroom-based research surely suggest, we should, following such
models, include amongst the former at the very least the following:
socio-cultural setting, educational system, and what observable events actually occur in the
classroom ("input"). One way of characterising different learner-internal factors
is to distinguish between cognitive factors, and affective/personality variables.
The study of cognitive variables has in the past focussed for example on the
construct "aptitude", while the terms "attitude" and "motivation" reflect a concern
with affective factors deemed to be personality-based.
Further, in this brief attempt to offer a terminological framework for what
follows, I shall use the term "cognitive style" to suggest a specific constellation
of intellectual/cognitive factors, while "learning style" will be used more
broadly, on the assumption that both cognitive and context-based or affectively-based
internal variables contribute to different learning styles. This terminological
convention may be justified by some research findings. Thus Naiman et al.
(1978) found "field independence" (a construct concerning cognitive
preferences/skills) to correlate highly with their proficiency measures for learning
French, while Hansen and Stansfield (1981) did not find this construct to
correlate with their measures for communicative competence, suggesting in fact that
"a strong interest in other people and attentiveness to social cues in the
communication task (which are associated with field dependence) perhaps leads to
effective communicative skill" (Hansen and Stansfield 1981: 363). As learning in
classrooms is, centrally, a social activity, and as "interest" and "attentiveness"
are in the classroom undoubtedly affected by attitudes and motivation, it seems
reasonable to assume that learning style is not exclusively determined by
cognitive skills.
A largely exclusive concern with either external or internal learning
variables is doubtless a function of one's own research interests or background.
Further, either focus can be justified on commonsense grounds. For while it seems
obvious that what observably happens in classrooms determines what is learnt
there, it is equally obvious that ultimately learning takes place, if at all, inside
the heads of those who learn. Again, either focus may well also be strategic: on
the one hand, it is, other things being equal, less difficult to work with
observables than with non-observables; on the other hand, it might well be argued that
it is advisable to diagnose differences before recommending treatments.
If we for example go back to roughly the sixties, we discover two research
paradigms reflecting an overriding concern with either external or internal
learning influences, with each largely ignoring the other. On the one hand, much
research around this period consisted of large and small-scale undertakings
inside the method-comparison paradigm, where external factors were controlled
and manipulated; on the other hand, this was also the period when important
work concerning psychological concepts such as aptitude, attitude and
motivation was carried out. The main thrust of this latter research was aimed at
measuring such internal factors and their relevance to foreign language learning
success.
The former research paradigm was/is implicitly behaviouristic in its
psychological presuppositions, and did not strive to differentiate between learners as
individuals, while the latter is clearly cognitive in psychological set, and sought
(amongst other things) precisely such differentiations. The former was
classroom-based and of immediate potential didactic relevance (when research
revealed which method was superior, it was clearly to be recommended, and
followed), while the latter was to all intents and purposes neither: the practical
relevance was more in terms of screening applicants for special language
training options in institutions where such training is based on selection, and in terms
of general educational sensitivity in for example teacher-training.
Research inside the methods comparison paradigm failed to establish that
teaching method was the sole or indeed major determinant of learning success.
Apart from problems of research design, particularly concerning the
operationalisation of methodological labels such as "audiolingual", the central reason for
this lack of clear results must be the SINGLE FACTOR MYTH. In other words,
the undertaking was simplistic in seeking to isolate one part of the classroom
complex (labelling it "teaching method") and relate learning outcomes to this
single factor (cf. e.g. Stern 1983, chapter 21; Edmondson 1984).
The latter paradigm again focusses on only part of the complex we are
concerned with, and above all, fails to offer an answer to the practical questions of
language instruction. There are of course continuing doubts as to what, if
anything, the construct "language aptitude" might be, whether it is distinguishable
from general intelligence, and whether it measures competence, as opposed to
grammatical knowledge (e.g. Oller 1981; Krashen 1981).
We have of course moved on since this period. Of major importance have
been the impact of sociopragmatic/communicative perspectives, deriving from
sociolinguistics and the philosophy of language, the general "focus on the
learner" (which was in part a reaction against the methods comparison tradition),
and the cognitive information-processing framework for understanding learning,
deriving from cognitive science and artificial intelligence. We are however far
from having anything approaching a generally accepted research paradigm,
despite the establishment of "second language acquisition" as a term and as a focus
of intensive classroom-based research. One reflection of this, as I understand it,
healthy state of diversity is some uncertainty as to how internal and external
factors can be meshed together in a theory, and indeed how we can engage in
empirical research which does not focus on the one at the expense of the other.
There is still the danger that we repeat the mistakes of the past, using however
"richer" concepts (or simply different ones) developed since then.
3 Are Learner-Internal factors variable over time?
In attempting to reconcile "internal" and "external" factors impinging on
learning, we will naturally need to clarify what we understand "learner-internal"
factors to be, and, critically, how far they are subject to change over time. By
this, I do not mean to address issues of learning age or cognitive maturation, but
the question how far one can learn to be a different type of learner, how far
cognitive and/or learning style are subject to external influence.
Nativist approaches, in part based on Chomskyan linguistic theory, and in
part on studies claiming regularity in acquisitional sequences, stress biological
predeterminants "inside" the learner whereby the mechanisms for language
learning are universally given, and teaching can merely delay or support this
built-in learning mechanism (this is an oversimplification: more differentiated
theories inside this paradigm allow of individual variation both in terms of
timing and in acquisitional sequencing for some language domains, as e.g. in
Pienemann 1985). While it is undoubtedly true that humans are biologically
equipped to learn natural languages, and, further, that this equipment sets
constraints of various kinds on, for example, the forms of hypotheses developed by
learners attempting to come to grips with foreign language samples for which
their interlanguage offers no analysis, an exclusive focus on universal grammar
and alleged developmental sequences cannot be appropriate or adequate for the
concerns of foreign language classroom learning: it mirrors methodologically an
exclusive concern with internal factors such as motivation or aptitude. (This
issue has been thoroughly and at times heatedly debated in Germany in recent
years: the latest state of play is given in Bausch and Königs 1986.)
It seems that the Input Hypothesis is based on a nativist approach. To
parody somewhat, the central concern of the teacher is to gently press the trigger,
to follow the Chomskyan metaphor, by making the right type of input available.
Additionally, in Krashen's theory, an Affective Filter hypothesis supplements
the Input hypothesis, seeking to account for the fact that learners in more or less
identical learning environments learn different things at different times to
different degrees and at different speeds. This filter operates negatively. On this
view, external features of the learning environment thought to be conducive to
learning will in principle suffice to optimise learning, provided other internal
factors which inhibit this acquisitional processing can be removed or reduced by
appropriate didactic means. Following this Filter model, then, intake constitutes
a subset of the available input, and presumably optimal learning is achieved
when intake mirrors input (although, indeed, it sometimes seems that the two
terms are used more or less interchangeably, as for example in Krashen 1981:
101-102).
However, it is worth stressing that foreign language instruction is most commonly
part of an educational system, and it is essentially the business of education
to change individuals (a point stressed, for example, by Brumfit 1984: 117).
This is not a contradiction of a nativist approach, but a difference of
focus. The notion "learning to learn" is totally consistent with what we know
about the human brain, and suggests that at least some learner-internal cognitive
abilities and skills are both learnable and teachable. The question is, of
course, which? If on the one hand, we accept that all humans are eminently
equipped for mastering several languages, and not just a single tongue (see for
example Wandruszka 1979), but if on the other hand, we are forced to recognise
that in classrooms at least the ways in which learners differ in their learning
achievements are more startling than the ways in which their learning is similar,
such that hypothesising one universal underlying cognitive mechanism seems
not to provide an adequate account, then the question as to how far teaching can
influence change in learners' cognitive and learning styles is a critical one.
Consider the notion of the "good language learner". Does the good language
learner constitute a norm we should be aiming at, or is he or she the
marked case, due to internal gifts or characteristics not uniformly distributed
amongst the learning population? If the latter is the case, then an attempt to
extrapolate from the case of such learners, by for example providing classroom
conditions deemed conducive to "good learning" will be of dubious validity.
More concretely, the distinction raises the question as to whether didactic
treatments are to be seen as corrective or supportive of differences between
learners. On the one hand, it is commonly suggested, for example, that the use of
different communicative and learning strategies should be part of teaching
programmes, while on the other hand it is also suggested that classroom activities
should be varied or flexible enough to offer learners with different styles scope
for following their individual inclinations (see here the concluding remarks in
Lafayette and Buscaglia 1985). Stasiak (1985, 1988) reports on an extensive
study carried out at the University of Gdansk, focussing on the issues raised by
the distinction between inherent and acquired learner-internal characteristics.
As accounts of this study are perhaps not readily accessible, I shall briefly report
on it.
On the basis of various "intelligence structure" tests (including the I-S-T 70
test of Richard Amthauer), some 1500 mature subjects were selected for the
experiment, and placed in four groups of 100. These subjects were then taught one
of four previously unknown languages for one year in small classes. The bases
for these groupings were two: high versus low scorers on the test battery, and
"logicians" versus "verbalists". The latter opposition may be related to -/+ risk-
taking, +/-structure-dependence, serialists vs. globalists, or simply, in Krashen's
sense, natural "learners" versus natural "acquirers". On the argument that
relatively marked cognitive preferences on this latter dimension would tend to lead
to accuracy without fluency (to use Brumfit's distinction, e.g. Brumfit 1984), or
the opposite, the didactic treatment given in the courses was compensatory.
Simplistically, the "learners" were given no rules until they were confident in
speaking, while the "acquirers" were not allowed to speak until they had learnt the
rules. Terminal testing showed no significant differences between the
achievements of the "learners" as opposed to the "acquirers", but a remarkably
significant difference between the highly scoring groups, and the groups with much
lower total scores on the original intelligence structure tests.
The compensatory didactic treatment is premised on the assumption that
learning styles can be changed, that a distinction is possible between the inter-
subject strength of relevant cognitive abilities (high-scorers versus low scorers),
and the relative intra-subject strength of different cognitive skills (the
"serialists" versus the "globalists"). The former learner characteristics are, on the
evidence of this research, not affected by didactic treatment, while the latter
apparently are.
The didactic consequence drawn from pre-testing is then stimulating: it
suggests, in fact, that we should maybe be providing learners with what they don't want.
I think this research raises as many issues as it clarifies, possibly because the
accounts cited are meagre, and a fuller documentation is unfortunately not known
to me. Instead of concluding, as Stasiak does, that this research shows that taking
account of learners' individual cognitive profiles leads to more effective
learning, one might for example suggest that highly intelligent persons learn more
successfully under frustrating teaching conditions than do less highly
intellectually gifted persons. Even so, the didactic consequences drawn from detected
differences between cognitive styles reinforce the simple point I am seeking to
make, namely that we need to clarify our views on posited learner-internal
differences, regarding their universality and the degree to which they are subject to
change via classroom learning experience.
4 Classroom Interaction
The complex interaction inside the learner between internal and external
factors may be linked to classroom interaction by the observation that "The
interaction between external and internal factors is manifest in the actual verbal
interactions in which the learner and his interlocutor participate" (Ellis 1985:
129).
If this link is to be fruitfully exploited for research purposes, however, we
need, I want to suggest, a more process-oriented theory of discourse interaction,
and may fruitfully supplement discourse studies of classroom interaction by
studies which attempt to investigate more directly the cognitive mechanisms
underpinning discourse behaviours.
It seems to me that there are two interpretations of the notion of "discourse
interaction" which one can distinguish. One we might call a "weak"
interpretation of interaction, the other the "full" interpretation. I posit the distinction in
the belief that often in current classroom-based research only the "weak"
interpretation is taken into account. We may assume that interaction occurs between
two subcomponents of some complex when A affects or determines B, and B
affects or determines A (cf. Ellis's formulation of the "interactionist" position on
internal and external learning factors, Ellis 1985: 129).
A first, weak, interpretation of the notion of interaction is essentially linear,
whereby interaction occurs over a sequence of time intervals, such that for
example that which affects (interactant A) is active at time 1, the effects occur at
time 2, while the reciprocal relationship may require two further time units (in
discourse, times 2 and 3 are maybe collapsed, as turn-taking occurs). A second,
stronger, interpretation is bilateral, whereby interaction occurs inside one time
unit, inside which A and B are both determining and being determined.
Let me attempt to illustrate the distinction I am trying to make with
reference to classroom discourse. The "weak" notion of interaction is simply a
reflection of the conventions of turn-taking and sequential relevance in spoken
discourse. The stronger argument is based on the nature of discourse meaning,
and claims that the "meaning" of a discourse contribution produced at time 1
may be subsequently developed or established in the ensuing discourse: what
follows it may have determining retrospective force (Downes 1977; Leech 1980:
79-117; Edmondson 1982). In the context of the classroom, this means, I suggest,
that the very notion of an "input" which is distinct from its discourse
consequences is open to question: learner responses may determine or affect "input"
not only prospectively but retrospectively. I want to develop this argument a
little, and relate it to the notion of classroom "negotiation".
Without needing to develop a specific theory, we can accept that a discourse
unit has a specifically discoursal "meaning" which is more than its semantic
meaning, more than its sentential meaning. Such meanings do not exist
independently of human processing agents. So when X produces a unit of
discourse (an utterance, let us say), one obvious notion would be that the relevant
discourse meaning is a "speaker meaning": X in some sense has or seeks to
communicate this meaning. This leads us to a (surely discredited) speech act
theory of discourse. The theory says that the "meaning" of a unit of discourse is
present in the speaker's head prior to enunciation, and thus the hearer's task is
to interpret the resultant utterance in a way consistent with the speaker's
communicative intention (notice how this parallels Krashen's Filter model, referred
to above). This, however, is not how talk (or foreign language learning/teaching)
works; it is a sort of "Lockstep" concept of conversation. It takes no account of
interaction in the full sense, in short no account of the nature of discourse
negotiation, to which I shall turn presently.
The only other realistic candidate for the achievement of discourse meaning
is clearly the hearer. Given that turn-taking conventionally occurs, and given
that the determination of meaning may require several turns at talk, both (or
indeed all) discourse participants are of course involved in this determination. So
interaction is involved in the determination of the discourse meaning of units of
talk.
A first point which follows, I believe, and which I shall pick up later, is that
as an analysis of classroom discourse is (possibly amongst other things)
concerned with explicating the meanings arrived at via interaction in that discourse,
then there is inevitably a psychological or cognitive aspect to such explications,
which goes far beyond the assumption that learner responses evidence to a
greater or lesser degree that teacher inputs have been understood. To put the
point here another way, if the explication of coherence is part of the analytic
task in investigating classroom discourse, and if, again, coherence is created by
discourse participants, simply being reflected in the cohesion of what they say,
then an interactional analysis is inevitably concerned with discovering or
hypothesising the interpretations arrived at by the participants in the course of that
interaction. This will necessarily involve taking different analytic perspectives
inside an analysis, as meanings constructed by different participants may, of
course, not match (see e.g. Hawkins 1985). In such cases, interaction is going on,
and plausibly learning may be going on, too.
A second point concerns the adequacy of the notion of "meaningful" or
"comprehensible" input. The former phrase seems in fact to be ill-chosen, as the
"input" can only become "meaningful" as intake, unless we wish to understand
by the phrase something such as "input composed of sentences or sentence
fragments whose syntactic structure and lexical elements are within the scope of the
interpretive faculties of the learner's current interlanguage". Clearly, this is not
meant, as such input is plausibly argued to be relevant to learning. As input, as
opposed to intake, it is as it were potentially meaningful. The phrase
"comprehensible input", however, stresses an interactional perspective, as the
comprehensibility of the input is clearly determined by the processing mechanisms
of the recipient, the learner. The term "comprehensible" seems unfortunate,
however, suggesting as it does an either/or dichotomy, with overtones of what is
implied by "comprehension questions" for which there is only one factually
correct answer (cf. Long 1985: 383). Perhaps the term "negotiable input" is not
inappropriate.
This brings me to my third point: our understanding of the term
"negotiation". In second language acquisition, the term has come to have a specific
meaning, characterised by Ellis (1985: 141), following Long (1983), as
procedures used for checking on uptake or comprehension: essentially repair
sequences, initiated by either source or recipient of the repairable. Long (1983),
who it should be stressed is concerned with the negotiation of "comprehensible
input", and not explicitly with the negotiation of discourse meaning, also
includes native-speaker/teacher behaviours such as speaking slowly, which are
adopted in the likelihood of assisting comprehension on the part of the
non-native speaker/learner. It is, of course, perfectly reasonable to choose to define a
specific term for one's own purposes. Further, it is desirable that the climate in
the classroom is such that learners ask questions concerning that which they do
not understand. Finally, it is an important research issue to determine which
features of teacher talk are likely to assist understanding. However, such an
operationalisation of the concept of negotiation is impoverished from an
interactional perspective (problems implicit in equating the frequency of repair
sequences with the appropriateness of the resultant discourse for learning are
raised in Aston 1986).
The term negotiation can have at least the following senses:
(a) A discourse outcome is worked towards by participants with different and
incompatible interests or goals. Business deals, for example, may be negotiated
in this way.
(b) Discourse meanings are negotiated in the sense sketched above.
(c) Via explicit repair sequences, the sense of one or more discourse
contributions made by one party is clarified.
(d) Via explicit repair sequences, the meaning of one or more expressions used
in a discourse contribution is clarified.
(The distinction between (c) and (d) is reflected in the difference between
the questions "What do you mean?" and "What does X mean?", where X is a
linguistic expression).
These different senses are hierarchical, in that (d) may occur inside (c), (c)
inside (b), and so on, but not vice versa. In other words, sense (d) is at the
bottom of the hierarchy. Further, it is worth stressing that this hierarchy is not
strictly ordered, in the sense that for example failing to understand the meaning
of expressions used by a speaker does not necessarily inhibit understanding what
he (or she) means.
We appear to require a richer notion of interaction, which, I have
suggested, will inevitably contain elements of cognitive interpretation. Having
suggested that such a theory of interaction is of itself of relevance to areas of
current research, I wish finally to explore briefly the implementation of such a
theory in a model for the analysis of classroom discourse. I have in this paper
already used the terms "interaction" and "discourse" rather loosely and
interchangeably. In fact, I do not wish to draw a distinction between "interaction
analysis" and "discourse analysis", although this distinction is often made (see
e.g. Ellis 1985, chapter 4 and the references therein). In terms of such a
distinction, the former term is associated with the application of a finite set of categories
to classroom behaviours (a tradition stemming from Flanders 1970), and the
latter term also covers perhaps the application of an ethnomethodological
approach to classroom interaction (as, for example, in a concern with turn-taking
conventions, or question-answer sequences, e.g. Riley 1977; Allwright 1980). In
these (opposed) senses, it should, it seems to me, be possible to extend the
"discourse" perspective such that one embraces an "interactional" analysis of
classroom talk, in that all the communicative behaviour taking place there is
systematically analysed, as moves towards mutual understanding.
It seems though that attempts to go further in developing schema for the
analysis of foreign language classrooms have practically disappeared, though we
were inundated with such systems in the late seventies (an interesting exception
to this generalisation is Lörscher 1983). Flanders-like analytical systems have
been justifiably criticised for many reasons, amongst others by Long (1980).
Long points out, for example, that one major problem in applying a
predetermined list of categories to classroom discourse is where the categories are to
come from. Even when such systems achieve reliability, their validity is open to
question (as is their relevance to the question of effective learning). One
possible answer, which I have sought to develop in a number of papers (e.g.
Edmondson 1981b, 1983), is to develop a discourse analytic system for "natural"
native-speaker discourse, and then ask how, or indeed whether, the results map
into what is done in language classrooms. It seemed to me in this undertaking
that classroom discourse is both like and unlike non-pedagogic discourse, and,
moreover, often simultaneously. I therefore postulated so-called "co-existent
discourse worlds", a concept which is at the same time a construct inside a
discourse analytic system, and a psycholinguistic hypothesis, if one interprets a
"discourse world" roughly as a cognitive frame of reference (Edmondson 1985,
1987).
The suggestion is then that discourse processing and discourse analysis are
necessarily related issues, and that therefore a richer classroom interactional
analysis system will produce analyses which can be related to questions of
learner processing. The result will be a mode of analysis which could not
easily be reliably applied on a grand scale, generating data for productive
quantitative analysis: indeed, initially case studies might be more useful. Further, the
question of subjectivity in analysis requires attention. The data obtained for
detailed interactional analysis will be usefully supplemented by research strategies
which attempt to tap learner procedures and perceptions more directly. Verbal
reports, i.e. introspective data of various kinds, collected during or consequent
to classroom activity, seem to be the most promising means of achieving this (cf.
e.g. Allwright 1984; Hawkins 1985). House (1986) suggests the interesting
possibility of incorporating retrospective interpretations of previous learning
activities into teaching programmes.
5 Summary
I have tried to make the following points regarding the "ins" and "outs" of
foreign language teaching/learning research:
Both internal and external factors co-determine learning success. Under the
former are to be included both universal cognitive abilities whereby humans are
uniquely equipped to acquire a language or languages, and cognitive and
affective/emotional traits which distinguish learners in terms of their cognitive and/or
learning "styles". Assuming that foreign language learning is part of education, it
seems sensible to assume that both cognitive and learning styles are subject to
influence, i.e. learnable, until such time as the opposite is firmly proven. In other
words, it is necessary both for learning theory and teaching practice to decide
which individual internal factors are subject to change, and which, if any, are not.
"Single cause" hypotheses, whereby the nature of classroom learning is
attributed to one (external or internal) factor, are a priori unlikely to be insightful.
Such hypotheses may, however, be useful in stimulating research that
establishes their inadequacy and thereby contributes towards a more adequate theory.
As I understand it, something like this is happening to the Input hypothesis.
Studies of, for example, intake, comprehension and output reflect this
development (e.g. Swain 1985; White 1987; Brown 1986).
It is still relevant to take account of native-speaker behaviours in
attempting to understand classroom procedures and learner processing. The point is
made above with regard to discourse analysis and the concept of "negotiation",
but may hold for example for communicative and/or learning strategies, or the
comprehensibility of different lecture presentations (cf. Long 1985; Chaudron
and Richards 1986). In other words, it would be useful to know in what ways, if
any, native-speaker behaviours differ.
Terms such as "interaction" or "negotiation" require careful definition and
use. One danger is that we reduce the meaning of a term in order to make it
more manageable, in that we for example operationalise a psycholinguistic
concept by substituting a behavioural one: we substitute for an internal process an
external product.
There is, it seems to me, a place for a renewal of concern with characterising
what goes on in classroom interaction from the perspective of discourse analysis,
which will necessarily and explicitly contain concepts and categories of
psycholinguistic relevance. Such analyses should be supplemented by participant
reporting data.
We need a type of eclectic pluralism regarding research methodologies and
interests. The notion of classroom-based research should not imply one
exclusive research paradigm. Indeed the search for such a paradigm can only be
pursued inside the confines of a theory, and may preclude the discovery or
postulation of other, potentially more fruitful, theories.
Notes
1. I assume here that on any non-trivial interpretation of the term "interaction", reading and
writing activities are also to be viewed as interactive; cf. e.g. Widdowson (1979, chapter 13),
Edmondson (1981, chapter 7). The point is important in a foreign language teaching/learning
context, given the common view that foreign language classrooms have not only a training
function, in terms of inculcating specific language skills, but also an educational function
in the broadest sense (cf. Widdowson P. and U.). Hence, while oral skills are commonly a
major focus of foreign language teaching courses, work within texts, especially in
non-beginners' language courses, is of no less importance.
2. Cf. the notion of "world-switching" in classroom discourse (Edmondson 1981b, 1985) or the
teacher encouragement of learner error (Edmondson 1986).
References
Allwright, R.L. 1980. "Turns, Topics and Tasks: Patterns of Participation in Language Learning
and Teaching." Discourse Analysis in Second Language Research ed. by D. Larsen-Freeman,
165-187. Rowley, MA: Newbury House.
Allwright, R.L. 1984. "The Importance of Interaction in Classroom Language Learning." Applied
Aston, G. 1986. "Trouble-shooting in Interaction with Learners: the More the Merrier?" Applied
Bausch, K.R. and F.G. Königs, eds. 1986. Sprachlehrforschung in der Diskussion. Tübingen:
Gunter Narr Verlag.
Brown, G., ed. 1986. Comprehension ( = Applied Linguistics, 7/3.) Oxford: Oxford University
Press.
Brumfit, C.J. 1984. Communicative Methodology in Language Teaching. Cambridge: Cambridge
University Press.
Chaudron, C. and J. Richards. 1986. "The Effect of Discourse Markers on the Comprehension of
Lectures." Applied Linguistics 7.113-127.
Downes, W. 1977. "The Imperative and Pragmatics." Journal of Linguistics 13.77-97.
Dunkin, M.J. and B.J. Biddle. 1974. The Study of Teaching. New York: Holt, Rinehart and
Winston.
Edmondson, W.J. 1981a. Spoken Discourse. London: Longman.
Edmondson, W.J. 1981b. "Worlds within Worlds-Problems in the Description of Teacher-Learner
Interaction in the Foreign Language Classroom." Proceedings of the 5th AILA Congress ed.
by J.G. Savard and L. Laforge, 127-140. Quebec: Laval University Press.
Edmondson, W.J. 1982. "On the Determination of Meaning in Discourse." Linguistische Berichte
78.33-42.
Edmondson, W.J. 1983. "Diskurs im Fremdsprachenunterricht als Handlungsgeschehen."
Handlungsorientierte Fremdsprachenunterricht ed. by A. Raasch, 39-42. Tübingen: Gunter Narr
Verlag.
Edmondson, W.J. 1984. "Methods, Approaches, Principles and Practices." New Approaches in
Foreign Language Methodology ed. by W. Knibbeler and M. Bernards, 53-62. Brussels:
AIMAV.
Edmondson, W.J. 1985. "Discourse Worlds in the Classroom and in Foreign Language Learning."
Studies in Second Language Acquisition 7.159-168.
Edmondson, W.J. 1987. "'Acquisition' and 'Learning': the Discourse System Integration
Hypothesis." Perspectives on Language in Performance ( = Festschrift Werner Hüllen) ed. by W.
Lörscher and R. Schulze, 1070-1089. Tübingen: Gunter Narr Verlag.
Ellis, R. 1985. Understanding Second Language Acquisition. Oxford: Oxford University Press.
Flanders, N. 1970. Analysing Teaching Behavior. Reading, MA: Addison-Wesley.
Gass, S.M. and C.G. Madden, eds. 1985. Input in Second Language Acquisition. Rowley, MA:
Newbury House.
Hansen, J. and C. Stansfield. 1981. "The Relationship of Field Dependent-Independent Cognitive
Styles to Foreign Language Achievement." Language Learning 31.349-367.
Hatch, E. 1978. "Discourse Analysis and Second Language Acquisition." Second Language
Acquisition ed. by E. Hatch, 401-435. Rowley, MA: Newbury House.
Hawkins, B. 1985. "Is an 'Appropriate Response' always so Appropriate?" Gass and Madden
1985.162-178.
House, J. 1986. "Learning to Talk: Talking to Learn. An Investigation of Learner Performance in
Two Types of Discourse." Kasper 1986.43-57.
Kasper, G., ed. 1986. Learning Teaching and Communication in the Foreign Language Classroom.
Aarhus: University Press.
Krashen, S. 1981. Second Language Acquisition and Second Language Learning. Oxford:
Pergamon.
Krashen, S. 1982. Principles and Practice in Second Language Acquisition. Oxford: Pergamon.
Lafayette, R.C. and M. Buscaglia. 1985. "Students Learn Language via a Civilization Course-A
Comparison of Second Language Classroom Environments." Studies in Second Language
Acquisition 7.323-342.
Leech, G. 1980. Explorations in Semantics and Pragmatics. Amsterdam: John Benjamins.

Long, M. 1980. "Inside the Black Box: Methodological Issues in Classroom Research on
Language Learning." Language Learning 30.1-42.
Long, M. 1983. "Native Speaker/Non-native Speaker Conversation and the Negotiation of
Comprehensible Input." Applied Linguistics 4.126-141.
Long, M. 1985. "From Input to Intake: on argumentation in second language acquisition" Gass
and Madden 1985.377-393.
Lörscher, W. 1983. Linguistische Beschreibung und Analyse von Fremdsprachenunterricht als
Diskurs. Tübingen: Gunter Narr Verlag.
Naiman, N., M. Fröhlich, H. Stern and A. Todesco. 1978. The Good Language Learner
( = Research in Education Series, 7.) Toronto: Ontario Institute for Studies in Education.
Oller, J. 1981. "Research on the Measurement of Affective Variables: some Remaining
Questions." New Dimension in Second Language Acquisition Research ed. by R.W. Andersen,
14-27. Rowley, MA: Newbury House.
Pienemann, M. 1985. "Learnability and Syllabus Construction." Modelling and Assessing Second
Language Development ed. by K. Hyltenstam and M. Pienemann, 23-75. San Diego: College-
Hill Press.
Riley, P. 1977. "Discourse Networks in Classroom Interaction: some Problems in Communicative
Language Teaching." Mélanges Pédagogiques. University of Nancy: CRAPEL.
Sharwood-Smith, M. 1981. "Consciousness-Raising and the Second Language Learner." Applied
Stasiak, H. 1985. "Untersuchungen zur Korrelation zwischen glottodidaktischen Begabungen und
anderen Richtungsbegabungen." Zielsprache Deutsch 16.16-20.
Stasiak, H. 1988. "Sprachbarrieren beim Fremdsprachenerwerb: Einfluss der
Richtungsbegabungen." Neusprachliche Mitteilungen 41.26-29.
Stern, H.H. 1983. Fundamental Concepts of Language Teaching. Oxford: Oxford University Press.
Strevens, P. 1977. New Orientations in the Teaching of English. Oxford: Oxford University Press.
Swain, M. 1985. "Communicative Competence: some Roles of Comprehensible Input and
Comprehensible Output in its Development." Gass and Madden 1985.235-253.
Van Lier, L. 1988. The Classroom and the Language Learner. London: Longman.
Wandruszka, M. 1979. Die Mehrsprachigkeit des Menschen. Munich: Piper.
White, L. 1987. "Against Comprehensible Input: The Input Hypothesis and the Development of
Second-language Competence." Applied Linguistics 8.96-110.
Widdowson, H.G. 1979. Explorations in Applied Linguistics. Oxford: Oxford University Press.
Widdowson, H.G. 1983. Language Purpose and Language Use. Oxford: Oxford University Press.
Linguistic Theory and Foreign Language Learning Environments
Suzanne Flynn
Constructing learning environments that enhance the foreign or second
language learning process necessitates integrating findings drawn from a wide
range of sources. Traditionally, such findings derived principally from one's own
intuitions and experiences about what simply "worked" and, on occasion, from
developments isolated by learning theorists.
Now, however, other tools have become available that allow us not only to
confirm our basic intuitions but also to supplement them in certain principled
ways. These tools enable us to deal with the new demands created by expanded
learning environments. One such instrument is linguistic theory. Current work in
linguistic theory raises a number of new issues that could have important
implications for the design of effective instructional contexts. For example, at one
level this work sheds light both on the nature of language knowledge and use
and on the role of input in language learning. At another level, this work could
prove important in terms of the development of effective groupings and
sequencings of curricular materials. Interest in these issues is not new.
Historically, however, attempts to integrate linguistic theory and language pedagogy
often ended in failure; this in turn resulted in an almost total divorce between
linguistic theory and language pedagogy (see related discussion in Newmeyer
1983; Newmeyer and Weinberger 1988). Such failures were in large part due to
the fact that linguistic theory did not easily allow extensions to language
pedagogy.
In recent years, however, there have been significant developments in
theoretical linguistics and in the psycholinguistic research that derives from such
work, especially in the areas of language acquisition research. One important
consequence of these advancements is that they open up possibilities for
establishing new connections between linguistic theory and language teaching. We
are now in the position to begin to make meaningful conjectures about possible
linkups between these two domains and to make suggestions about possible
programs of research that could empirically test these hypotheses (see also
Sharwood-Smith 1981; Dulay, Burt and Krashen 1982; Klein 1986; Rutherford 1987;
Cook 1988a, among others, for attempts to relate linguistic theory and language
pedagogy).
Isolating and speculating on the potential contributions of linguistic theory
for the language teaching environment is the principal focus of this paper. To do
this, we will first outline one recent perspective on the general nature of language
as well as consider relevant related issues concerning both first and second
language acquisition. Once these preliminaries have been established, we
will then focus on several ways in which linguistic theory might prove relevant
for language teaching concerns.
In this paper, the term second language acquisition is used to refer to both
second and foreign language acquisition. That is, it covers language learning that
takes place either in a context in which the target language is the principal
language spoken in that culture or in a context in which it is not.
In addition, within the framework of this paper, general terms such as instructional
settings, language pedagogy, etc. are used to refer to a number of
other more specific domains, e.g. teacher preparation, classroom composition,
curricular design, and use of technology, to name a few.
1 Background: The Nature of Our Linguistic Knowledge
It is incontrovertible that language is a complex system of interrelated subcomponents
or levels, each with its own associated set of properties and principles.
The basic levels of language consist of the phonology, the morphology, the
syntax, the semantics, including the word storage system (the lexicon), and the
principles governing the use of language in communicative contexts (the pragmatics).
To become a native, or native-like, speaker of a language, one must acquire
the competence for each of these systems. Such competence is reflected in the
intuitions that speakers have about the well-formedness of items at each level of
linguistic organization.
For example, English speakers agree about the word-potential of different
phonotactic combinations: that 'blip' is an English word, that 'blap' is a potential
English word and that 'sblap' could not be an English word given the rules of the
language. At the morphological level, they know that 'blapped' and 'blapping'
are acceptable transformations of the potential verb to 'blap' without having to
know the meaning of the verb. At the syntactic level, English speakers can recognize
the difference between the grammatical sentence (John is a teacher) and
the ungrammatical (*John a teacher is). At the semantic level, they recognize
anomalous sentences (!The chair thinks a hole in one) and can identify paraphrases
as expressing the same meaning (John wrote the angry rebuttal) and (The
rebuttal was written by John). At the pragmatic level, they can distinguish between
polite questions ("Would you please close the window?") and rude questions
("Close the window, huh?") in particular contexts (e.g. requesting the
window closed from one's future employer). A language learner's competence
as a speaker or listener reveals an even more profound knowledge of the
properties of her language, a knowledge that is both complex and abstract.
Consider, for example, the complexity of the knowledge that must be represented
in the competence of an English speaker to account for the normal performance
in assigning coreference between a reflexive pronoun and a noun. The
indices indicate coreference assignments. An asterisk means that the coreference
assignment is not possible.
COREFERENCE
(1a) Mary_i saw herself_i.
(1b) Mary_i saw her_*i.
In the example in 1a, speakers of English will agree that coreference is
possible between Mary and herself; that is, they must refer to the same person.
However, despite the grammaticality of the sentence in 1b, Mary and her cannot
be construed as the same person; her in this example must refer to someone
else. In order to account for these facts we must appeal to linguistic principles
and properties that make reference to abstract structural configurations underlying
these surface strings. More specifically, we need to differentiate the two
cases in terms of distinct syntactic domains defined over abstract hierarchical
trees in which the reflexive, herself, and the pronoun, her, can operate.2
Similarly, the interactions that hold between any two of these subcomponents
are also highly complex and abstract. Consider, for example, one interaction
between syntax and phonology, namely the wanna contraction that occurs in
colloquial American English. This is illustrated in 2.
WANNA CONTRACTION
(2a) I want to win the race.

(2b) I wanna win the race.
(2c) Who do you want to visit t?3

(2d) Who do you wanna visit?
(2e) Who do you want t to visit Bill?

(2f) *Who do you wanna visit Bill?
Want to in 2a and 2c can be contracted to wanna as in 2b and 2d. Descriptively,
we can understand this process by noting that the position from which the
wh word (who) has been moved in 2e blocks the contraction in 2f. In other
words, while there is no phonological realization of the site from which the wh
word has been moved, it is nonetheless "real" and computed by speakers of
English.
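The contrast in 2 can be made concrete with a minimal sketch. It assumes, purely for illustration, that a sentence is a list of words in which the silent site of the moved wh word is written as the token "t" (as in 2c and 2e); the function name and this representation are hypothetical and are not part of any linguistic formalism cited here.

```python
TRACE = "t"  # assumed marker for the silent site a wh word has moved from

def contract_wanna(tokens):
    """Contract 'want to' to 'wanna' only when the two words are strictly adjacent.

    An intervening trace blocks contraction. Traces have no phonological
    realization, so they are dropped from the returned surface string.
    """
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "want" and i + 1 < len(tokens) and tokens[i + 1] == "to":
            out.append("wanna")
            i += 2
        else:
            if tokens[i] != TRACE:
                out.append(tokens[i])
            i += 1
    return " ".join(out)

# 2c: the trace follows 'visit', so 'want' and 'to' are adjacent
print(contract_wanna("who do you want to visit t".split()))
# 2e: the trace sits between 'want' and 'to', so contraction is blocked
print(contract_wanna("who do you want t to visit Bill".split()))
```

The first call yields the contracted form in 2d, while the second leaves "want to" intact, so the ungrammatical 2f is never produced. The point of the sketch is only that the rule must consult a position that is absent from the audible string.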
As native speakers of any language we all "know" these facts and many
more. Most intriguing is that we all "learned" these complexities quite rapidly
and in the face of what appears to be quite limited exposure to our language.
2 Focus of current linguistic endeavors
2.1 Universal Grammar
An important focus of contemporary linguistics is to develop an account of
these facts as a function of an interacting system of modules or subsystems. In
fact, the most explicit theory of the human competence for language and its acquisition
has been proposed by Chomsky in the form of a generative theory of
Universal Grammar (UG).
A central development in this work is its shift away from the concept of language
as a system of rules to a view of it as a function of fundamental principles
that interact with a set of parameters, the setting of which has rich deductive
consequences for a particular grammar. Taken together these two components
allow us to account for both the shared universal properties of language and the
differences observed among languages. To illustrate, as shown in 3, languages
can differ with respect to the setting of the head-direction parameter (Stowell
1981). Languages can be head-initial as shown in 3a for English and Spanish; or
they can be head-final as shown in 3b for Japanese. In head-initial languages,
heads in phrasal categories precede their complements; for example, in 3a the
head in the noun phrase (NP) is "the child", and this head precedes its complement
relative clause, "who is eating rice". In contrast, in head-final languages,
the complement precedes the head as illustrated in the NP shown for Japanese in
3b.
HEAD-DIRECTION PARAMETER
(3a) Head-Initial
English
[The child [who is eating rice]] is crying.
Spanish
[El niño [que come arroz]] llora.
(3b) Head-Final
Japanese
[[Gohan-o tabete-iru] ko-ga] naite-imasu.
'Rice-obj. eating is child-subj. crying is.'
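As a rough illustration of how a single parameter setting fixes the ordering in 3, the sketch below treats the head-direction parameter as a Boolean and linearizes a head and its complement accordingly. The function name and the flat string representation are assumptions made for this example only.

```python
def linearize(head, complement, head_initial):
    """Order a head and its complement according to the head-direction setting."""
    return f"{head} {complement}" if head_initial else f"{complement} {head}"

# 3a, English: head-initial, so the head noun precedes the relative clause
print(linearize("the child", "who is eating rice", head_initial=True))
# 3b, Japanese: head-final, so the relative clause precedes the head noun
print(linearize("ko-ga", "gohan-o tabete-iru", head_initial=False))
```

The same function with the same setting would apply to any other head and complement pair, which is the sense in which one setting carries consequences beyond the noun phrase.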
Once a child establishes that her language is either head-initial or head-final,
a number of deductive consequences follow from this, namely that this
head-direction should hold for all other major phrasal categories as well. The
exact representation of these principles and parameters is a source of extensive
debate among linguists (see for example Chomsky 1986; Cook 1988b; Lasnik
and Uriagereka 1988; Radford 1988 for discussion). Nonetheless, there is widespread
agreement that the faculty that underlies the speaker's linguistic knowledge
is discrete from other types of knowledge humans possess about the world
around them. In addition, any explanation of a speaker's linguistic competence
ultimately must include a specification of the ways in which this domain-specific
knowledge interacts with other mental processes of perception, memory etc.
However, this issue is well beyond the scope of contemporary linguistic inquiry.
2.2 Language Acquisition
UG also proposes a very strong theory of acquisition. As such, UG "provides
a sensory system for the preliminary analysis of linguistic data and a schematism
that determines quite narrowly a certain class of grammars" (Chomsky
1975: 12). Within this context, "knowledge of grammar, hence of language, develops
in the child through the interplay of genetically determined principles
and a course of experience" (Chomsky 1980: 134).
Informally, we speak of this process as language learning. The mediation of
UG in language learning restricts the infinite number of false leads that could be
provided by random induction from unguided experience of surface structure
data alone (Lust 1986). As a theory of acquisition, UG makes several predictions.
For example, one prediction is that learners' hypotheses about language are
structure dependent; that is, "early hypotheses about possible grammatical components
are defined on sentences of words analyzed into abstract phrases"
(Chomsky 1975: 32). This means that learners naturally abstract out from what
they hear and organize the language, for example, a sentence, into hierarchies of
phrasal units. In this sense, UG restricts the nature of the hypotheses learners
will consider about the target language they are learning.
More specifically, UG predicts that the relevant properties learners attend
to in acquisition are those isolated by the principles and parameters approach of
UG. For example, if some version of the UG formulation is correct, we should
find evidence that learners will know that languages will instantiate some type of
a head-complement ordering. At the same time, we should find evidence that
they are attempting to establish the correct head-direction for the language they
are learning in acquisition. The theory also predicts that within this context, the
speech environment to which the learner is exposed plays an important but
limited role in acquisition. Its principal function is to specify those ways in which
the open parameters of UG are instantiated in a particular language. That is, the
environment provides the data base necessary for the learner to establish the
values of the parameters associated with UG in order to construct the grammar
of a particular language. The role of the environment in this framework represents
a major departure from traditional behaviorist models in which the environment
provides everything that is needed for language learning.
Current research in theoretical first language acquisition seeks to document
the role of UG in the language learning process (see work represented in for
example Lust 1986; Roeper and Williams 1987).
In summary, as a theory of language, UG provides a system of principles
and parameters which of necessity constitute the properties of all languages. As
a theory of biological endowment for language, UG provides an early schematism
that learners apply to languages. This schematism in turn significantly constrains
the nature and range of hypotheses learners will entertain when
acquiring a new language.
2.3 Second Language Acquisition
More recently, work has been initiated within a UG framework in second
language acquisition (see, for example, work represented in Flynn and O'Neil
1988). Consistent with the predictions of UG for language acquisition, results of
this initial body of research in second language acquisition suggest that adults'
hypotheses about the target second language are structure dependent; that is,
they do not employ strategies which scan the surface structure string of an utterance
alone but are sensitive to underlying abstract structural configurations of
language. There is also evidence to suggest that principles and parameters of
UG constrain the range of hypotheses second language learners apply to the
learning of the target second language.
In turn, these preliminary findings allow us to begin to reconcile two seemingly
disparate bodies of data suggested by two earlier approaches to second language
acquisition, namely Contrastive Analysis (CA) (Fries 1945; Lado 1957)
and Creative Construction (CC) (Dulay and Burt 1974). The role of parameters
within UG provides a mechanism to account for the role of the first language experience
isolated by CA. The role of principles provides a mechanism to account
for universal properties common to all acquisition processes initially isolated by
a CC theory of second language acquisition.
3 Implications for instructional settings
While there are no definitive answers yet available with respect to either
first or adult second language learning, there are a number of issues that the language
acquisition research as well as the theory from which it emerges raise for
the design of effective instructional settings. In particular, both types of research
could improve our understanding about what knowledge is available to the learner,
how this knowledge is used and how learning takes place. In turn, these insights
have consequences for teacher training, classroom composition, as well as
for the development of effective groupings and sequencing of curricular materials.
We will consider these and others in more detail below.
3.1 What Knowledge is Available to the Learner?
To begin, we know that adult second language learners do not start with
"clean slates". That is, they bring to the language learning context knowledge
not available to the child first language learner. At the same time, we know that
adult second language learners also share with children a certain body of common
linguistic knowledge.
More specifically, we know that adults have at least three distinct bodies of
knowledge available to them:
1. General linguistic knowledge about principles and parameters of UG. This
is shared with child first language learners.
2. Specific linguistic knowledge of at least one language. This is not shared
with child first language learners.
3. All manner of extra-linguistic knowledge that follows from mature cognitive
development and experience with at least one or more cultures. This knowledge
is not shared with children.
While the existence of either a knowledge base derived from the first language
or one derived from general cognition may not be surprising, the role of
general properties of UG in the adult language learning process may be. The existence
of this body of knowledge means that adult second language acquisition, in
contrast to what many traditional approaches, namely CA, and several more recent
approaches, e.g. the Fundamental Difference Hypothesis (Bley-Vroman 1989), would
hold, is not restricted by the learner's first language alone or by unconstrained
problem solving strategies.
Through their knowledge of UG, adult second language learners are not restricted
to surface structure facts of a language alone. Their knowledge of UG
involves a capacity that is both complex and abstract. This was briefly illustrated
above in the coreference and wanna examples in 1 and 2. More specifically, learners
bring to the language learning context a set of structural sensitivities comparable
to those that they bring to the first language learning situation. That is,
there is evidence that suggests that learners are prepared to pick up the same
abstract structural properties of the second language grammar that they did for
the first language grammar, for example the head-direction of a language (see
related discussion in Martohardjono and Gair 1989).
Knowledge of the first language means that learners have a fully developed
competence for at least one other language. This means that they have constructed
a specific grammar from the principles and parameters provided by a
theory of UG. More specifically, open parameters have been specified for particular
values. Some of the values of these parameters will match those of the
target second language and some will not. In addition, through their knowledge
of their first language, adult learners know all sorts of idiosyncratic and non-paradigmatic
properties for at least one language. Very few, if any, of these
properties will match those for the target second language.
At the same time, all manner of non-linguistic knowledge is available to the
adult second language learner. Adults, in contrast to children, bring to the language
learning task the benefits of adult cognition. They have knowledge about
the world, have developed problem solving strategies as well as a sophisticated
meta-cognition not observed in young children. All of these can be used by the
adult to both facilitate and disrupt their learning. However, it is important to
note that such knowledge is not the sole driving force in acquisition. We know
that adults do not learn language by a set of cognitive principles that also account
for their learning of how to play chess, as suggested for example by the
Fundamental Difference Hypothesis (for a more detailed discussion of this issue
see Flynn and Carroll, in press). In short, the kind of knowledge available to the
adult goes far beyond what has traditionally been envisioned for the adult learner.
Knowing that these three bodies of knowledge are available to the adult
learner has several possible consequences for language teaching. Most generally,
it means that we can make certain assumptions about the adult learner's knowledge.
We know that all learners will share knowledge of a certain common linguistic
base, namely UG. We also know that divergences that exist among
learners will principally derive from differences that exist between the first and
second language of the learner, for example where parametric settings between
the first and second language differ. Knowing both of these facts allows us in
turn to establish more precisely what has to be learned: differences in parameter-settings.
At the same time, we know that all or most learners will need to
learn the idiosyncratic properties of a language, e.g. idioms, irregularities introduced
by historical borrowings, individual lexical items (although not general
properties of the lexicon), among others. No theory of UG or any other knowledge
base will give us these facts.
At another level, one consequence of knowing what is available to the learner
is that language instructors need to be linguistically sophisticated; they need
to understand the specifics of each of these knowledge bases. At one level they
need to be familiar with the basic principles and parameters of a theory of UG
in order to understand what general linguistic knowledge all learners share and
what specific linguistic knowledge learners have of their first languages. This
suggests that instructors need to be familiar with the linguistic properties of the
specific first languages represented by the learners in their classes in order to
understand where differences will emerge.
In addition, instructors need to be generally acquainted with the results of
current psycholinguistic research, specifically research that relates to language
acquisition and use. At the same time, they need to be familiar with theories of
second language acquisition that attempt to integrate all of these domains into
coherent meaningful explanations of the second language acquisition process.
With respect to the learners themselves, the availability of these three
bodies of knowledge for all adult learners means that in principle, all adults are
capable of learning new languages. Explanations about why some adults do not
learn second languages will have to appeal to factors not related to the basic biological
capacity for language, e.g. inadequate exposure to the target language
or other complex factors related to issues of motivation.
In terms of classroom composition, these results suggest that a mixed model
consisting of both heterogeneous and homogeneous groupings based on differences
and similarities of parameter-settings of the first language would be
beneficial. We know that there are certain aspects of a new language that all
learners, regardless of their first languages, will have to learn, e.g. the idiosyncratic
and irregular properties, and those which only some learners will have to
learn, e.g. when parametric values differ between the first and the second language.
Dividing the classes up in this way means that in the case of a match in
parameter settings between the first and second language, students do not have
to be redundantly taught something they already know. In the case of the mismatch,
it means that students can receive the additional input necessary for
them to assign new values to parameters.
3.2 How Is This Knowledge Used?
All three bodies of knowledge (general linguistic knowledge, specific first
language knowledge and general cognitive knowledge) enter into the adult language
learning process. However, they do so in a highly interactive and constrained
manner.
3.2.1 General Linguistic Knowledge: Universal Grammar (UG)

UG knowledge means that learners bring to the language learning task a set
of predispositions to certain kinds of operations that can exist in languages.
Learners maintain general sensitivities about what are conceivable and possible
properties of language, and about what are legitimate and non-legitimate types
of moves that can be made in a language. For example, learners naturally know
in some sense that languages are hierarchically organized. They know that certain
kinds of "dominance" relations hold between constituents. To illustrate, in
sentence 4, her and Mary can refer to the same person. In contrast, in sentence 5,
Mary and she cannot refer to the same person.
(4) Near her_i, Mary_i saw a rock.
(5) Near Mary_i, she_*i saw a rock.
The reason for this difference has to do with differences in the dominance
type of relationships that exist between the pronoun and the noun. In sentence
4, her does not dominate Mary; that is, it is not higher in position than Mary in a
hierarchical tree structure of this sentence. In sentence 5, however, she dominates
Mary; it is higher in the tree. A general rule of language, roughly paraphrased,
states that pronouns cannot dominate their antecedents.
In addition, we know that learners will attempt to apply structure dependent
hypotheses to the new target language. We know that learners will not commit
certain kinds of errors that violate boundaries of abstract phrasal units; that is,
they will not formulate structure independent hypotheses. To illustrate, we do not
find sentences like that in 6 in the speech of adult second language learners (nor
in the speech of child first language learners).
(6) *Is the dog which in the corner is hungry?

(from Jenkins 1988:110)
Such sentences represent the application of a structure independent rule in
which the first verb in the sentence, regardless of its phrasal membership, is
fronted to form a question. If learners simply applied rules that were based
on such structure independent notions of order in a linear string alone, we might
expect such an error. Such a question would by a simple analogy match that
formed from the sentence "The dog is hungry"/"Is the dog hungry?" The fact
that we don't find learners, even untutored ones, making these errors suggests
that they naturally apply structure dependent hypotheses to language.
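The difference between the two kinds of rule can be sketched in a few lines. The sketch assumes a deliberately simplified representation in which a declarative sentence is already divided into a subject phrase, a main-clause auxiliary and a predicate; the function names and this three-part split are illustrative assumptions, not an analysis proposed in the works cited.

```python
def front_first_verb(words):
    """Structure independent rule: front the first 'is' found in the linear string."""
    i = words.index("is")
    return ["is"] + words[:i] + words[i + 1:]

def front_main_aux(subject, aux, predicate):
    """Structure dependent rule: front the main-clause auxiliary, leaving the subject phrase intact."""
    return [aux] + subject + predicate

subject = "the dog which is in the corner".split()
aux = "is"
predicate = ["hungry"]

flat = subject + [aux] + predicate
# Linear rule: reproduces the unattested error in 6
print(" ".join(front_first_verb(flat)))
# Structural rule: yields the well-formed question
print(" ".join(front_main_aux(subject, aux, predicate)))
```

For a simple sentence such as "the dog is hungry" the two rules give the same question, so only sentences with an embedded clause inside the subject can reveal which rule a learner is using.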
3.2.2 First Language Knowledge

In addition, their knowledge of a first language interacts with and may at
times compete with their general linguistic knowledge (see related discussion in
Felix 1985). When it interacts rather than competes, knowledge of the first language
facilitates second language learning. One way that the first language is
used is to determine whether or not new parametric values must be assigned to
parameters. Where the first and second language match in parameter settings
learners do not need to assign new values to these structures. Where they do not
match, the learner must assign a new value to the existing parameter. The first
language in this way determines what specifically has to be learned.
At the same time, we know that the first language can also interfere with the
second language process. At some non-parametric levels of language, although
not as yet fully specified, it appears that a lack of a match in properties may
cause problems in learning. For example, Oller and Redding (1971) found evidence
to suggest that the learning of articles (a, an, the) was disrupted for speakers
acquiring English as a second language when the first languages of the
learners did not have article-like categories.
Somewhat paradoxically, we also know that the existence of certain comparable
properties in both the first and second language does not always facilitate
learning. For example, Clahsen (1988) reports that Turkish speakers
learning German as a second language will use a SVO (subject-verb-object) pattern
in spite of the fact that both German and Turkish require clause-final verb
placement in embedded clauses: "the generalization ... holds regardless of the
learner's L1" (op. cit.: 61).
Phonologically, interference from the first language is commonplace. For
example, the observed inability of Japanese speakers to perceive or
produce the /r/ and /l/ distinction in English is argued to result from the fact that
/r/ and /l/ are not phonemically distinguished in Japanese and they are in English.
The lack of this distinction in Japanese is believed to interfere with the subsequent
learning of this distinction in English. It is important to keep in mind,
however, that the interference function of the first language is not necessarily its
dominant role in the second language learning process.
3.2.3 General Cognitive Knowledge

In addition to UG knowledge and first language knowledge, adults also have
general cognitive knowledge available to them. This means that the adult learner
can access a set of problem solving strategies not available to a child as well
as general knowledge about the world. Adults also maintain a heightened meta-cognitive
awareness at all levels as noted above. The adult can use this knowledge
to gain and maintain control of her linguistic environment in a manner not
possible for a child. An adult is able to recognize breakdowns in communication;
she can elicit more linguistic input when necessary; she can isolate exceptions in
paradigms or locutions.
In addition, the adult is capable of understanding explanations about the
language and for certain aspects of language can use these explanations to enhance
their own learning. Adults may at the same time attempt to use this
knowledge to override linguistic hypotheses by bypassing structural decomposition.
For example, in comprehension, an adult second language learner can bypass
structural decomposition of an utterance itself and is able to integrate what
they know more generally about expectations of the task requirements in order
to solve particular tasks. It is also important to stress here again, however, that
general cognitive processes are not the only forces that drive the system as originally
thought and espoused in many approaches to adult language learning.
Problem solving alone for language learning will not develop linguistic competence
at the level needed to become a native or near-native speaker of a language.
In fact, sole reliance on such strategies will prolong and in some cases
hinder language development.
Understanding how adults use their knowledge in language learning can be
used to enhance the design of instructional settings in several ways.
At the most general level, we know that at least three general bodies of knowledge
enter into the language learning process. While each uniquely contributes to
the process, acquisition is most likely truly facilitated when all three operate interactively.
The system is probably at its worst state when either knowledge of
the first language or problem solving strategies are solely drawn upon or where
all three bodies are in competition. Thus, one challenge in terms of enhancing
language learning would be to create classes that interactively and strategically
draw upon these three knowledge bases and minimize interference from competing
domains. For example, one would want to design language exercises that
cannot be accomplished through problem solving strategies alone. If such activities
become routine, one would end up "knowing" a language in much the
same way that one knows a series of opening gambits for a chess game. We need
to create activities wherein a linguistic solution would yield one result and a
non-linguistic solution would yield another. This is necessary in order to
get students to draw upon something other than problem-solving strategies
alone. It is important to know that students are not simply resorting to astructural
strategies so that instructors are not lulled into believing that learners have
attained a certain level of language competence when in fact the students have
simply been good problem solvers.
At another level, knowing what is available to the learners helps us to anticipate
their problems and strategies and understand the errors they make with
regard to the level and the domain from which they derive. For example, we
know that learners will not as a rule commit errors that violate basic linguistic
tenets, e.g. apply structure independent rules (sentence 6). Problems may, however,
emerge with respect to exceptional language facts, in cases of non-parametric
competition between the first and second languages, or in places where a
general problem solving strategy has erroneously been applied. Knowing the
source of these problems provides opportunities to develop subsequent exercises
or explanations that accurately and appropriately address the problem.
Such solutions, however, need to be based on a fairly intimate knowledge of
what is available to the learner and what kinds of intervention will yield results
for particular problems.
3.3 How Does Learning Take Place?
Given the nature of the knowledge available to the adult learner, we know
that there is a strong deductive component involved in language learning. This
means that language learners do not learn the new language by translating word
for word from the first language to the second language. They are capable of
looking for higher order conceptual units and will do so quite naturally when
given the opportunities by abstracting out from what they hear. Essentially, the
construction of the target second language is a grammar-driven process rather
than a data-driven one.
These results also suggest that learners will proceed through a natural sequence
of development guided by innate principles. While developing, these
learners will extrapolate from the language environment what they need when
they need it. How much of this is open to actual learning is still an important empirical
question.
We also know that knowledge of their first languages can serve to facilitate
learning. Where there is a match between the first and second language, learners
will rely upon what is already available to them from their first languages.
For example, a Spanish speaker learning English has more available to her that
can be used when learning English than a Japanese speaker does. One way that
Spanish matches English is in its being a head-initial language. Because both
languages share this property, adult Spanish speakers do not have to re-learn
this fact about English. They can draw upon what they already know from Spanish
when learning English. Japanese, on the other hand, as illustrated above in
sentence 3b, is a head-final language; this means that Japanese speakers need to
assign a new value to the head-direction parameter in order to acquire English.
They also do so in a manner that corresponds to what children do when learning
English as a first language (for extended discussion see Flynn 1987; Flynn and
Lust in press; Flynn forthcoming).
We also know that, in contrast to many theories about language learning,
adult second language acquisition does not proceed by random induction from
surface language facts alone. While some inductive learning is involved, and
research needs to isolate more precisely where, this learning is also highly
constrained. Of all the possible hypotheses and strategies an adult could
formulate and use when learning a second language, given all the knowledge
available to an adult, adults simply do not apply non-linguistic hypotheses to
the learning of a second language. In fact, what is impressive about the adult
second language acquisition process is not the trivial ways in which first and
second language acquisition seem to differ but the significant manner in which
the two processes converge.
In terms of instructional settings, knowing how learning takes place has sev
eral consequences.
For example, as in first language acquisition, the learning environment must
be rich enough to provide the input necessary for the learner to deduce the right
properties of the target language. This suggests, as already documented for first
language acquisition, that the learner needs as much exposure as possible to
natural language. In addition, the language learning environment must be inter
active and directed to individual learners. While it is not always possible in a lan
guage classroom, the goal for language learning contexts should be to simulate
such an environment. Ideally, this interaction should be between two interlocu
tors; however, it is also conceivable that other forms of language exchanges can
provide some of this interaction in new and creative ways. For example, one can
imagine developing computer programs that respond immediately and appropri
ately to the learner such that they simulate but not necessarily substitute for the
needed one-to-one language "instruction" provided by caretakers with their
young children. Such work is the focus of the Foreign Languages and Literatures
computer projects being developed within the context of Project Athena at MIT,
for example.
The existence of a strong deductive component to second language learning
also strongly suggests that not all corrections are meaningful or useful. We know
from first language acquisition that one can with great effort get a child to cor
rect a previously ungrammatical utterance only to have the child resort to using
the ungrammatical utterance until she is really ready to change naturally. A simi
lar phenomenon is also often observed with adult learners. Part of the reason
why these corrections appear useless is that the type of input given to the adult
and perhaps the time at which it was given in development were simply
meaningless to the learner. It seems that the right kind of input is needed and it
must be given at the right time in order for such intervention to have any lasting
effect. The form of this input will also not always be in the form of an explana
tion as suggested above. It will more often than not involve more linguistic input
of a particular kind, for example expansions and paraphrases of key utterances
in as many varied syntactic structures as possible. Determining exactly what the
key utterances are is dependent upon the instructor's understanding of the na
ture of the error made. Determining when such input is useful is also dependent
upon one's knowledge of what developmental stage the student has attained.
A program to do exactly this could easily be developed
with current technology in computer-aided instruction.
3.4 Consequences for Curriculum Development
Knowing what knowledge is available to the learner, how this knowledge is
used and how learning takes place raises a number of important issues in terms
of more specific aspects of curricular development.
All of these findings challenge many of our traditional ideas concerning the
organization of materials to be presented in a language classroom. Drawing
upon the principles and parameters approach, one might envision developing
curricular materials that are organized around the clustering of properties
associated with the parameters. The clustering of such properties will not, in general,
correspond to surface structure facts of language in any neat way. They are often
concerned with fairly abstract relationships existent in languages. For example,
when teaching Japanese students English, one unit that could be developed
would be on the head-initial property of English. In so doing, one would want to
present the students with materials that dealt with noun phrase configurations,
verb phrase configurations, prepositional phrases, as well as complex sentences.
In this way, learners would be exposed to a range of linguistic phenomena
associated with this parameter. At the same time, other properties that have been
linked to the head-direction parameter concern the anaphoric relationships that
exist between anaphors and their antecedents, for example, the relationship
between the antecedent Mary and the anaphor her in sentence 4 above (see Lust
1986; Flynn 1987 for a more detailed discussion). Linkups such as these have
widespread implications in terms of other aspects of the language as well. For
example, they relate to the formation of complex sentences — specifically, the
formation of relative clauses (Mary saw the man who is my father). They also
relate to the use of redundancy in coordinate sentence structures; that is, forward
redundancy reduction patterns in English have been correlated with the
head-initial structure of English (Mary saw the man and ø shook his hand) (see
review in Lust 1986). While it is not yet clear whether "learning" or "acquisition"
will account for a second language learner's knowledge of all these facts, orga
nizing materials in this way may in fact represent a potentially significant ad
vancement over some traditional approaches to curriculum materials.
Current formulations of linguistic theory also challenge traditional notions
concerning complexity and simplicity. In many current classrooms and texts, the
linguistic development of the materials presented often progresses in lockstep
fashion from simple one-clause sentences, to questions, to two clauses (moving
from coordination to subordination), with thematically organized vocabulary
being simultaneously introduced in each unit. Given the view of language as
conceived within a theory of UG as a system of interacting modules guided by
principles and parameters, such an approach, however, may not be the most
beneficial to the learner or even the most relevant. Approaches based on
general cognitive notions of simplicity and complexity might dictate such a
progression; approaches based on linguistic theory may not necessarily do so,
although at times the two may overlap.
This means that simple and complex within a UG framework, for example,
might roughly correspond to the sequence in which parameters are presented
and the order in which the clusterings of associated properties are presented for
the parameter.
For example, based on the first language of a learner, one would first
determine what had to be learned in terms of parameters. Then one would establish
an ordering of focussed presentation that might roughly correspond to the order
in which they emerge in, for example, child first language acquisition or in terms
of naturalistic second language learning (the two should be essentially
equivalent). With respect to each parameter, the order of properties focussed on might
first begin with what is most regularly observed in languages and then progress
to their regular, exceptional properties. One might also consider incorporating
in this context traditional notions of simplicity and complexity by first focussing
on clusterings within a phrasal unit and then expanding out to more complex
phrasal units. In this way, materials presented to the learner are linguistically or
ganized on several different levels simultaneously.
With respect to standardized tests, these findings raise questions about
whether traditional tests designed to evaluate the linguistic competence of a
learner provide reliable measures of their competence. In the context of a
principles and parameters approach, linguistic knowledge goes far beyond an ability to
distinguish between "who" and "whom" or to use the past tense correctly. In
order to determine exactly how developed a learner's linguistic competence is,
one would want to develop tests that measure such things as knowledge of a par
ticular parameter and its associated clustering of properties as well as how well a
learner has integrated this linguistic knowledge with all other related domains of
language learning.
4 Conclusions
In summary, the purpose of this paper was to explore the possible implica
tions of linguistic theory for language pedagogy. What has been discussed in this
paper is only a fragment of what can ultimately be achieved and tested. Conti
nued study and dialogue between the two domains explored in this paper can
yield the insights necessary for the continued development of both principled
language learning environments and ultimately principled theories of language.
Acknowledgements
The author wishes to thank Ralph Ginsberg, Claire Kramsch, Charles Ferguson and Jack Carroll
for discussions concerning various aspects of the issues addressed in this paper as well as for the
many suggestions made for revision with respect to an earlier version of this paper. The author
would also like to acknowledge the participants at the Bellagio conference for their helpful com
ments and questions.
Notes
1. Important to note is that the discussion in this paper will center principally on the acquisi
tion of linguistic knowledge. This is not to say that this is all that one needs to learn in order
to become a native or near-native speaker of a language. Discussion of the acquisition of
other necessary properties and components, for example the target culture, can be found in
several other papers in this volume. Discussion of such issues is, however, beyond the scope
of this paper.
2. Technically, we can account for these facts in terms of principles of Binding Theory pro
posed within a theory of Universal Grammar. For a detailed discussion of the specific de
tails, see Chomsky 1981 as well as Lasnik and Uriagereka 1988; Radford 1988.
3. The trace (t) in both sentences 2c and 2e indicates the position from which the wh-word
(who in this case) has been moved in order to form a question. The t is a type of place-holder.
In sentence 2c, the t indicates that the wh-word was the object of the verb,... visit who. In sentence
2e, the t indicates that the wh-word was the subject of the infinitive clause, who to visit Bill.
The asterisk in sentence 2f indicates that this sentence is ungrammatical.
References
Bley-Vroman, R. 1989. "What is the Logical Problem of Foreign Language Learning?" Linguistic
Perspectives on Second Language Acquisition ed. by S. Gass and J. Schachter, 41-68. Cam
bridge: Cambridge University Press.
Chomsky, N. 1975. Reflections on Language. New York: Pantheon Press.
Chomsky, N. 1980. Rules and Representations. New York: Columbia University Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986. Knowledge of Language. New York: Praeger Press.
Clahsen, H. 1988. "Parameterized Grammatical Theory and Language Acquisition: A Study of the
Acquisition of Verb Placement and Inflection by Children and Adults." Flynn and O'Neil
1988.47-75.
Cook, V. 1988a. The Relevance of Grammar in the Applied Linguistics of Language Teaching. Ms.
University of Essex, England.
Cook, V. 1988b. Chomsky's Universal Grammar: An Introduction. Oxford, England: Basil
Blackwell.
Dulay, H. and M. Burt. 1974. "Natural Sequences in Child Second Language Acquisition."
Language Learning 24.37-53.
Dulay, H., M. Burt and S. Krashen. 1982. Language Two. Oxford: Oxford University Press.
Felix, S. 1985. "More Evidence on Competing Cognitive Systems." Second Language Research
1.47-72.
Flynn, S. 1987. A Parameter-Setting Model of L2 Acquisition: Experimental Studies in Anaphora.
Dordrecht: Reidel Press.
Flynn, S. Forthcoming. "Eubanks Revisited: Response to Flynn Revisited." To appear in Second
Language Research.
Flynn, S. and W. O'Neil eds. 1988. Linguistic Theory in Second Language Acquisition. Dordrecht:
Kluwer Academic.
Flynn, S. and B. Lust. In press. "A Response to Bley-Vroman and Chaudron." Language Learning
June.
Flynn, S. and J. Carroll. In press. Second Language Acquisition. England: Longman Press.
Fries, C. 1945. Teaching and Learning English as a Foreign Language. Ann Arbor, MI.: University
of Michigan Press.
Jenkins, L. 1988. "Second Language Acquisition: A Biolinguistic Perspective." Flynn and O'Neil
1988.109-116.
Klein, W. 1986. Second Language Acquisition. Cambridge: Cambridge University Press.
Lado, R. 1957. Linguistics Across Cultures. Ann Arbor, MI.: University of Michigan Press.
Lasnik, H. and J. Uriagereka. 1988. A Course in GB Syntax. Cambridge, Ma.: MIT Press.
Lust, B. 1986. "Introduction." B. Lust 1986.
Lust, B. ed. 1986. Studies in the Acquisition of Anaphora, Vol. 1: Defining the Constraints.
Dordrecht: Reidel Press.
Martohardjono, G. and J. Gair. 1989. "Apparent Inaccessibility in SLA: Misapplied Principles or
Principled Misapplications?" Paper presented at the 18th Annual Linguistics Symposium.
University of Wisconsin-Milwaukee.
Newmeyer, F. 1983. Grammatical Theory: Its Limits and Its Possibilities. Chicago: University of
Chicago Press.
Newmeyer, F. and S. Weinberger. 1988. "The Ontogenesis of the Field of Second Language
Learning." Flynn and O'Neil 1988.27-34.
Oller, J. and J. Redding. 1971. "Article Usage and Other Language Skills." Language Learning
1.85-95.
Radford, A. 1988. Transformational Grammar: A First Course. Cambridge: Cambridge University
Press.
Roeper, T. and E. Williams, eds. 1987. Parameter-Setting. Dordrecht: Reidel Press.
Rutherford, W. 1987. Second Language Grammar: Language and Teaching. London: Longman
Press.
Sciarone, A. 1970. "Contrastive Analysis: Possibilities and Limitations." International Review of
Applied Linguistics 8/2.115-131.
Sharwood-Smith, M. 1981. "Consciousness Raising and the Second Language Learner." Applied
Stowell, T. 1981. Origins of Phrase Structure. Ph.D thesis. MIT.
Culture in Language Learning:
A View From the United States
Claire Kramsch
Since the beginning of the decade, two developments have prompted
American foreign language teachers to take a renewed look at the relationship
of language and culture. First, political and economic realities have forced the
teaching of foreign languages to step out of its academic ivory tower and become
more closely linked to its practical outcomes for communicating with real
people in their natural environment. At the same time, progress in language ac
quisition research has broadened and diversified our conception of what it
means to be communicatively competent in a language. Sociolinguistic com
petence has been identified as a key aspect of successful communication; back
ground knowledge and shared assumptions have been shown to be a crucial
element in understanding oral and written forms of discourse. These develop
ments have run parallel, albeit a little later, to analogous trends that have taken
place in Europe since the 70's. At this point, on both sides of the Atlantic, it is
generally recognized that there is more to the successful exchange of meanings
than knowledge of forms and structures, and even than their appropriate use; or
rather, that everything revolves around what one means by "appropriate use".
Foreign language educators in the US tend to lump this surplus of meaning
under the category "culture". But what do they mean by culture?
The difficulty in dealing with this topic lies in its unavoidable subjectivity
and relativity. Different countries have different political cultures, different in
tellectual styles, different societal fears, hopes, prides, different meanings and
values attached to language and culture themselves. The American common use
of the word culture includes traditions, beliefs, institutions shared by a social
group or a whole society; it has an ethnographic flavor to it. American foreign
language textbooks distinguish between big C, the culture of literary classics and
works of art, and small c, the culture of the four Fs: foods, fairs, folklore, and
statistical facts. Both are generally viewed as distinct from the teaching of
language per se. German and French share the American meaning of culture
through their words Kultur and culture respectively, but they also have other
words that occur in conjunction with foreign language pedagogy: German
language educators tend to use the word Landeskunde, which has a more
geographical connotation and generally covers all aspects of the geophysical,
political, economic, social, ideological context in which the users of the language
live and work. The French have two words: la culture generally refers to an
extensive body of knowledge as well as to qualities of mind and usage of the
French language acquired mostly through exposure to the French educational
system (culture is the privilege of the "cultivated" person, or "educated native
speaker"). La civilisation is roughly the lexical equivalent of the German
Landeskunde, with a slightly greater emphasis on history and literature than its
German counterpart.
I will try to take here an interdisciplinary perspective on the topic, staying
clear of the distinction between big C and small c, and concentrating instead on
what Attinasi and Friedrich (1988) call "linguaculture", a term that stresses the
inseparability of language and culture (language in culture, language as culture and
other combinations) as constituting a single universe or domain of experience.
The example of the communicative approaches to the teaching of English as the
world's lingua franca, that have tended to promote a universally cultural mode
of communication based on pragmatic needs and functional imperatives, cannot
be extended in its pure form to the teaching of national languages. Rather than
seek ways of teaching culture as a fifth skill, similar to reading, writing, speaking,
listening, we have to explore the cultural dimensions of the very languages we
teach if we want learners to be fully communicatively competent in these lan
guages.
The concern over the link between the teaching of foreign languages and
foreign cultures is particularly acute in the United States, because of the
specifically American educational tradition.
1 Educational traditions in the teaching of language and culture.
1.1 Utility, democracy and scientific measures of progress (United States)
The teaching of foreign languages and cultures in American public schools
bears the mark of American educational history. Whereas in most other
industrialized countries, the primary aims of education have traditionally been
general intellectual discipline and development of the mind, predicated on the
idea that the acquisition of a body of knowledge, of a mind culture or intellec
tual training, are good both for higher education and for life, American educa
tion since WW I has had different priorities. At a time of economic prosperity,
political stability and American supremacy abroad, the 1911 report of the
National Educational Association's Committee of Nine and the Cardinal Principles
of Secondary Education of 1918 marked the deliberate rejection of the public
schools as agencies for academic training; instead, they were to prepare students
for the duties of life, educate citizens for their responsibilities in the community
and increase their chances of employment (Hofstadter 1963:333). By contrast
with the earlier European academic view, promoted in particular by the Com
mittee of Ten in 1893, the NEA now promoted an American, non-academic
brand of public education based on utility, democracy and scientific measures of
progress.
Compulsory mass secondary education had to be "practical and pay divi
dends" (Hofstadter 1963:299) by increasing the individual's opportunities for
upward mobility. It had to be suited to all citizens, not just the college-bound
and had to meet the needs of students from various backgrounds, of different
abilities and of different motivations. In addition, it had to fulfill the tax-payer's
needs for accountability in a free-market society. The faith in tests and in the
truth of test results reflected a scientific positivistic view of learning: whether it
be faculties of the mind or of the body, ultimate competence was composed of
the accumulation of separate, discipline-specific skills, that were non-transfer
able to other areas of knowledge and could be tested by scientifically designed,
standardized tests.
This educational philosophy made the study of a foreign language a luxury.
Foreign languages were necessary neither to the pursuit of individual happiness
nor to furthering the cause of mass democracy. Moreover, the teaching of
foreign cultures was seen as detracting from the goals of the melting pot
ideology. However, when, after a period of steady decline, foreign languages picked
up again in the late 70's, they were viewed in the same educational spirit as all
the other skills taught in school. But now, these skills had to serve more complex
international goals. As a series of political setbacks in Iran and Afghanistan
raised questions about US diplomatic ability and as the Japanese economic
miracle posed an unprecedented challenge to US international business,
competency in foreign languages was viewed as a solution to the nation's problems.
The 1979 report of the President's Commission on Foreign Languages and
International Studies reflects less the spirit of the Helsinki accords than the pressing
national concerns of the time: "Nothing less is at issue than the nation's security.
At a time when the resurgent forces of nationalism and ethnic and linguistic con
sciousness so directly affect global realities, the US requires far more reliable
capacities to communicate with its allies, analyze the behavior of potential ad
versaries, and earn the trust and the sympathies of the uncommitted. Yet there
is a widening gap between these needs and the American competence to under
stand and deal successfully with other peoples in a world of flux... In our schools
and colleges... the situation cries out for a better comprehension of our place
and our potential in a world that, though it still expects much from America, no
longer takes American supremacy for granted... Our lack of language
competencies diminishes our capabilities in diplomacy, in foreign trade, and in
comprehension of the world in which we live and compete" (Perkins 1980).
Such a statement is revealing on several counts. (1) It dramatizes the
American dilemma of wanting to exercise leadership on the international scene
without having the linguistic ability to do so. However, it sees in foreign languages
mere tools for the accomplishment of other, more important American goals
such as "living and competing" and ultimately regaining supremacy in a world
which, like football games, is divided from the start into allies, adversaries and
uncommitted. Foreign languages are viewed here, as the rest of education, in
their utility to further American actions and beliefs. They are not linked to a
deeper cultural competence that would allow Americans to first understand the
world and thus distinguish their allies from their adversaries. (2) It implicitly
assumes the universality of the US democratic, utilitarian system of thought by
decrying U.S. citizens' diminished "comprehension of the world", but not their
diminished comprehension of their own society.
The President's Commission report struck a familiar chord among Ameri
can foreign language educators, who, unlike their counterparts in other coun
tries' educational systems, had been conditioned since 1911 to a democratic and
pragmatic view of education. The concept of "language proficiency" and oral
proficiency testing, taken over from government agencies in 1980, and the adap
tation of the Foreign Service Institute proficiency scale for educational purposes
reopen the question of the relation of language and culture. Since "cultural
proficiency" is much more difficult to measure than, say, speaking proficiency, to
this day it has not found its place in the Proficiency Guidelines of the American
Council for the Teaching of Foreign Languages (ACTFL 1986), although some
suggestions have been made for French.
In the debate surrounding the link between language and culture, I believe,
with Bourdieu (1967), that systems of education breed systems of thought and
that those systems of thought constitute a great deal of what we call the "cul
ture" of a given society. At the very least, it represents the value attached by a
given society to the phenomenon of language and culture itself.
Since foreign language education is governed in the US by some 1600
school boards across the country, it could be instructive to look at the Gui
delines currently issued by the Boards of Education of the respective states to
see how they suggest integrating the teaching of language and culture.
1.1.1 Skill vs. content: The States' Boards of Education Guidelines.

Unlike other school systems, in which foreign languages are studied over a
long period of time in small increments (see for example for France, Porcher
1983), in the US foreign languages are taught in a relatively intensive manner
for a short period of time (one or two years on the average between the ages of
12 and 18); they are rarely compulsory and rarely accompanied by travel or study
abroad or by history/geography courses that deal with the target country. Culture
is commonly seen as making the study of a language more attractive, and as pro
viding a welcome relief from grammar and vocabulary exercises. Learning about
a foreign culture is not expected to require any intellectual effort, since it is
generally conceived only as the tourist's view of foreign ways of life. The teach
ing of a foreign culture is seen as distinct from "language training" — many feel
that it could be done almost more efficiently in English.
The States' Guidelines spell out the suggested or mandated political goals
of foreign language education in US public schools, their non-academic objec
tives and their academic learning outcomes.
1.1.2 Political goals

The primordial impetus and ultimate purpose for learning foreign lan
guages in most of the states at this time is what the President's Commission has
called "this nation's security". This gives a tone of urgency to most of the states'
guidelines, that claim such goals as: meet the challenge of international econ
omic and technological advances (Utah); reinforce this nation's security (Penn
sylvania, Indiana); cultivate international understanding, responsibility and
effective participation in a global age (Wisconsin); permit effective participation
in the local, national and international community (Pennsylvania), in an interde
pendent global society (Michigan); foster cross-cultural awareness (Texas); re
duce provincial biases, help recognize and respect differences among people
and cultures, bring about world peace (Hawaii); provide all students with cultu
ral and linguistic sensitivities necessary for world citizenship (Connecticut).
1.1.3 Non-academic objectives.

The states' guidelines reflect the non-academic tenets of American educa
tion.
- Life-adjustment. "Foreign languages prepare students for success, not just
for admission to college... all students should expect the development of
foreign skills that are usable in real life contexts... teachers should think
about language learning in terms of students' proficiency to do, not only in
terms of grammar or seat time" (Springfield, MA); "a consensus exists
among foreign language professionals, in the body politic, among students
and parents and among educators in general that the main goal of foreign
language learning should be the ability to demonstrate practical, meaningful
use of the language" (California); "the US need a citizenry competent in fo
reign languages because more people than ever go abroad for business,
pleasure and education" (Pennsylvania); Americans need "skills to
participate in the international business market" (Connecticut, Michigan);
"foreign language study is a vital factor in Michigan's economy" (Michigan).
- Accessibility to all. "Every American public student should have the oppor
tunity to acquire proficiency in a second language" (Wisconsin); "Study of a
foreign language is not an educational luxury, should not be limited to the
upper 25% or to those going to college" (Indiana). "Foreign language study
doesn't require superior control of one's native tongue to succeed"
(Springfield, Mass.). "Foreign languages do not represent an elitist, anachronistic
view of education" (Wisconsin).
- Career opportunities. "A foreign language is a marketable skill" (Wiscon
sin), "means employability" (Michigan), "career opportunities" (Connecti
cut, Utah).
- Accountability. "What is taught should be tested... testing is an integral
component of instruction" (Virginia); "Every course taught in Arkansas
public schools develops identifiable skills, the mastery of which can be
assessed by performance tests" (Arkansas). Foreign languages provide a
"testable skills continuum" (Kentucky).
- Discovery of American diverse cultural heritage, maintenance of American
cultural values. The goal of foreign language learning is "to understand and
appreciate people of different nationalities and ethnic groups and their con
tributions to the development of our nation and culture" (Virginia); "helps
understand one's ancestors" (Utah). "Students should develop a deeper
comprehension of their own culture by exploring another" (Texas,
Virginia).
1.1.4 Academic learning outcomes

Besides general educational goals such as "the acquisition of logical,
critical, creative thinking skills" (Michigan), the states' guidelines set linguistic and
cultural goals for foreign language study. It should "teach the basic listening,
speaking, reading and writing skills which will lead to the ability to think and to
communicate in the language" (Hawaii), to "proficiency" (Texas), to "mastery"
(Pennsylvania); "provides occasions for observing French culture and behaving
in ways appropriate to it" (Springfield, MA). "Communication in the foreign
language should be the major objective and the dominant activity in foreign lan
guage classrooms" (California).
The discrepancies between the pragmatic and the idealistic, the non-academic
and the general educational, the national and the international goals of
US education are striking. How can intercultural understanding arise from a
skill-oriented, behaviorally conceived foreign language proficiency? Do global
understanding and cross-cultural awareness automatically grow out of being able
to master the present tense, order a meal in a restaurant or handle social
situations (ACTFL Guidelines 1986)? How can critical thinking emerge from the
unquestioned American view of the pursuit of happiness? How can world peace and
effective participation in an interdependent global society result from the
adversarial view of the world suggested by the President's Commission? Finally:
How can international, intercultural goals be tested on an ACTFL proficiency
scale that is typical of American educational culture? Connecticut is one of
the few states that shows concern in this regard: "As long as foreign language
teaching emphasizes only skill development in a second language, global
education will not be a vital part of the foreign language curriculum. However,
if emphasis in foreign language instruction is placed upon the way language and
culture interact and influence the way one sees the world and upon the role
language itself plays in the interdependence of nations, there is a strong
relationship between foreign language education and global education"
(Connecticut).
It is interesting to contrast American objectives in the teaching of language
and culture with the stated goals of foreign language education in secondary
schools in Europe. Neither in France nor in West Germany do foreign languages
require the extensive legitimation needed in the US for historic reasons.
One or two foreign languages are the standard fare of all students completing
their secondary education. However, government guidelines do specify in both
countries what the learning outcomes should be within the general educational
philosophy of that country.
1.2 Systems of language and systems of thought (France)
The political reasons for studying foreign languages, for example English or
German, in public schools in France are stated in sober and realistic terms.
"English is spoken as a first or second language in many countries in the world".
"We cannot overestimate the particular importance of our relations with those
countries where German is the state language, and which constitute a linguistic
and cultural community of major demographic, economic and political
importance. The only one in Europe that serves as a bridge between western and
eastern bloc countries" (Instructions ministérielles 1986).
The goals of foreign language education are threefold:
- the linguistic goal is to acquire automatic behaviors (automatismes) in the
use of the language through appropriate training. Grammar should be
presented and taught not only in its morphosyntactic form, but also in its
functional semantic aspects. Oral skills are primordial but should be developed
through the study of written texts, read aloud, summarized and discussed.
Role-plays and other communicative activities should not be overdone.
Training in the written language is to develop not only grammatical and
lexical accuracy but also basic rhetorical skills, such as "logical coherence of
demonstration, chronological coherence in narration, spatial coherence in
description".
- the cultural goal is to acquire knowledge of the daily life, the political,
social, economic organization, of the artistic and literary production and of
the major historic events of the country under study.
- the educational goals are primarily conceptual. Study of German, for
example, should "broaden students' intellectual horizon, develop their
appreciation for effort, method and rigor, and refine their intellectual,
esthetic and moral judgment and sensibility".
These goals reflect the traditional French belief that the acquisition of
language is the formation of mental structures, that learning to talk is
learning to think, and that social acceptability in French society and abroad
is not only a question of using grammatically correct sentences, but of
employing the patterns of thought of the dominant, i.e., educated, discourse of
that society (Bourdieu 1982). Of course, this view not only acknowledges social
differences in the native language, but replicates them in the acquisition of a
second. By teaching foreign languages, the French educational system fulfills
its mission of furthering upward social mobility through an awareness of the
intellectual value of language and culture per se.
The question is: Do these conceptual outcomes further the cause of
intercultural understanding any more than the American ones? Does everyone in
the world share the French view of the importance of the intellect? To what
extent is upward mobility always linked to the ability to speak and write well
in different languages and to know other cultures? A more serious question
raised by the French model is the following: Is intercultural understanding
linked to a specific kind of social literacy and thus inevitably class-specific?
1.3 Language learning and political consciousness (Hessen)
Since education in the FRG is in the hands of the Ministries of Culture of
the individual Länder, I will take only one example, that of Hessen, whose
Guidelines are considered to offer a model of democratic progressive education
(Rahmenrichtlinien 1980). The goals of foreign language education are stated
there as furthering the development of the learner's personality through the
acquisition of information and the ability to reflect critically on that
information. More specifically, and in order of priority, knowledge of a
foreign language is claimed to:
- enable students to autonomously gather information from foreign sources
and increase their chances of becoming informed citizens,
- enable them, through the exercise of critical reflection, to bear political
responsibility and to contribute to the shaping of the community,
- increase their professional qualification. At least one foreign language
should be accessible to as large a student population as possible (op. cit.: 11).
The emphasis on these goals, as opposed to the more conceptual French
goals, is understandable, given both recent German history and Germany's
enlightenment tradition in education. Intellectual goals are closely linked to
Habermas' "demystification" of ideologies (Habermas 1970); moral objectives
have to do less with self-esteem and personal values (American tradition), or
personal integrity and intellectual purity (French tradition), than with
conscious social and political action. Thus message content, not rhetorical
style, is of prime importance in the development of communicative competence.
Learning a language is not a linguistic or pragmatic game. Reading activities
and group discussions in the classroom are meant to develop the ability to
collect information, to convey and understand intentions and, above all, to
reflect upon them critically.
As we examine the West German goals and match them with the current
international demands in education, several questions arise: Does the emphasis on
the content of information at the expense of its form and structure not present
an incomplete picture of intercultural communication? Intellectual styles
(Galtung 1981) or patterns of thought (Bourdieu 1967) are socially and
culturally determined and are so inseparable from the informational content
transmitted, that communication breakdowns occur more often than not at the
level of discourse, not at the level of the facts presented. Some of these
breakdowns are apparent, for example, when American students of German are
given textbooks written by Germans to teach German as a foreign language and
are asked to adopt a culturally different learning style.
The second question concerns the emphasis put in the Hessian Guidelines
on "demystifying" current ideologies. Since there can be no non-ideological
absolute standpoint, it might be more useful to think in terms of the
negotiation and joint construction of a reality that is agreed upon as a
safeguard against communicative intolerance.
Finally, there is no easy passage from reflection and enlightenment to
action. If the American view might be seen as too much focused on language as a
tool for action, the European view might be considered to be too concerned
with language as an object of linguistic or social reflection. Both views
illustrate two complementary aspects of culture: culture as performance,
culture as competence, to which I will now turn.
1.4 Culture as performance and competence
The need to account for the cultural dimensions of language forces us to
review the traditional, positivistic conceptions of quantitative, normative,
linear language learning.
There is a noticeable gap, for example, in most of the US boards of
education guidelines, between the intercultural goals of foreign language
education and their behavioristic theory of language. In Virginia, language is
seen only as a set of symbols or tools: "Language is a set of symbols used by
people to convey meaning. ... tools for transmitting thoughts and ideas"
(VA vol. 1, p. 5). "Language learning includes the acquisition of many skills"
(PA p. 5). It is not clear how purposeful, intercultural communication emerges
from the acquisition of isolated skills. In New York, we note the same
discrepancy between the goals and the means. The communicative goals are no
less than "the ability to understand, respect and accept people of a different
race, sex, ability, cultural heritage, national origin, religion and political,
economic and social background as well as their values, beliefs and attitudes"
(p. 4). But to reach these goals, teachers are encouraged to "use principles of
mastery learning, and to use informal and/or formal testing to assure
achievement of the objectives". In the Oklahoma Guidelines, we read on the one
hand: "The ultimate goal is the student's proficiency in the French-speaking
world. Attainment of this goal may also bring about an awareness of self and a
reassessment of personal values" (p. 4), but on the other hand, testing the
achievement of those goals is suggested in the following manner: "Demonstrate
curiosity about the French culture and empathy toward its people. Example:
Students might experiment with new foods such as snails, truffles, frog legs or
cheese; try a new sport, such as soccer, or make the effort to get into a
letter exchange with a French teenager" (p. 46). Experimenting with snails and
truffles hardly shows deep understanding of and empathy toward French attitudes
and values.
In a recent assessment of communicative language tests, Hart, Lapkin and
Swain (1987:89) take a sober view of the attempts to measure both the linguistic
and the cultural dimensions of communicative competence of students in
communicative language teaching programs such as the French immersion programs
in Canada. They show that even though these tests were holistic in conception,
the test measures developed in most instances approached discrete-point, single
trait indicators. "Overall operationalisation of the dimensions of communicative
competence essentially required operationalising single skills — in other words
discrete-point measures". They conclude with the realization that "without a
framework to interpret outcomes in relation to learning process and learning
opportunities, the pedagogical value of the results is severely restricted. In
other words, the appropriateness of task-based criterion-referenced tests to
general language education is largely dependent on possessing a prior framework
for interpreting results". And that framework, as we know, is culture-bound
itself.
2 Current efforts to link the teaching of language and culture in the
United States
In the wake of the President's Commission Report following the 1975
Helsinki agreement, several developments on the national level have served to
make foreign languages part of a general push for the internationalization of
American education.
2.1 Internationalization of American education
Realizing the need to meet the demands of a "global society" in which the
U.S. has to deal with trade deficits, competitiveness and disarmament issues, and
realizing in addition that the U.S. is less and less a melting pot, but more and
more a permanently multicultural salad bowl, efforts are being made at the
federal level to internationalize American education. The American Council on
Education surveyed the following aspects of international studies at the
undergraduate level in colleges and universities in the United States:
internationally-oriented majors, minors, certificate programs; foreign language
instruction; study abroad; faculty development, including support for travel
abroad and curriculum development; visiting foreign faculty and lecturers,
foreign students; internationally-oriented library resources;
institution-to-institution linkages overseas.
Its findings draw a sobering picture of the pervasive anglocentric
orientation of American higher education (Lambert 1989).
2.2 National Foreign Language Center
Under the direction of Richard Lambert, the NFLC convened a special task
force on the teaching of language and culture in September 1987 to explore
pedagogical needs and existing materials. Participants included scholars in
English and comparative literature, anthropology, foreign language acquisition,
linguistics and representatives of governmental agencies and business
corporations. Participants agreed that culture is both something you perform
and something you learn about. The discussion addressed questions related both
to cultural performance and cultural competence. Related to cultural
performance were such issues as: What can best be learned by living in the
country, what is best learned in a domestic instructional setting? At what
stage should the learners learn which features? How can teaching culture be
adapted to the purposes of the learner in a foreign country? What is an
appropriate unit of teaching: the communicative situation? behavioral segments?
speech acts? How should we select which cultural features to teach: their
generalizability across situations? their capacity to be matched with
linguistic features? How can we measure cultural competence: directly,
indirectly? How can recent technological advances help in the teaching of
culture? Eleanor Jorden's interactive video material for teaching Japanese
cultural performance represents a major step in the right direction (Jorden
1989) and so do other efforts, particularly for the less commonly taught
languages, such as Hindi (Gambhir 1987, forthcoming).
With regard to cultural competence, the planning group discussed the
general education goals of the teaching of culture: cultural aspects of discourse
and conversational style, critical reading skills transferred from mother tongue
reading classes, multiple perspectives on C2 read in the learners' native
language or in the target language, initiation to a nation's imaginative
universe of dreams, myths and self-perceptions as contrasted with the learner's
own native imaginative universe. The MIT Athena language learning project, in
particular Furstenberg's and Morgenstern's interactive video material,
represents original advances in an exploratory pedagogy for the development of
cultural competence (Murray et al. 1989). Cultural competence can best be
developed in a structured learning environment, where conscious parallels can
be drawn, where language can be explicitly linked to its meaning in a
particular sociocultural and historical context, where disparate linguistic or
cultural phenomena can be brought together and attached to more abstract
principles of both base (C1) and target (C2) language and culture. Teachers
should continually deepen their understanding of both C1 and C2 by reading
studies from a variety of sources that help identify and analyze cultural
patterns in the series of isolated cultural facts which they experience or
teach about.
2.3 Cultural Proficiency Guidelines
The American Association of Teachers of French (AATF 1989) has suggested
adding cultural proficiency to the ACTFL speaking, listening, reading
and writing proficiency guidelines (ACTFL 1986). It defines cultural proficiency
as a combination of three interrelated parts: the sociolinguistic skill of
communication, certain areas of knowledge, and certain informed attitudes. Here
are some excerpts from the basic cultural competence or "Minimal Social
Competence" as described in the AATF Culture Guidelines. It corresponds to
levels novice through advanced on the ACTFL language proficiency scale.
- sociolinguistic ability: can meet all the demands for survival as a traveller
...; can handle any common social situation with an interlocutor accustomed
to foreigners: make requests politely, offer and receive gifts and
invitations, apologize, make introductions, and discuss some current events or
policies, a field of personal interest, a leisure-time activity of one
French-speaking country; can participate in a conversation if conducted in
"français soigné", perhaps asking to have some expressions repeated or
paraphrased; manage to convey an attitude of good will via tone of voice and
nonverbal means.
- knowledge: can interpret simple menus, timetables ...; beyond the survival
level, knows about the phases of "culture shock" and how they may affect
perception; can identify the truth or untruth implied in the stereotypes of
his or her home culture and of French culture; ... can name at least two
present political parties in France, and two or three major contemporary issues;
can describe or give examples of qualities prominently sought in French
education, such as clear expression and organization of ideas, knowledge of
French history and geography, and literature; ... can describe in broad
outline the main geographical regions, the political institutions, the
public-education system, and the mass media of France or another
French-speaking country; can produce a few proverbs or stock phrases which
reflect a world view often encountered there; can say how that country's
institutions, regulations, and customs such as attitudes toward behavior and
appearance in public, may affect him or her as a foreign traveller (or student,
trainee, business person); ... can identify, in a literary or a journalistic
text, examples of elevated style and of familiar and popular expressions, and
in reading, can point out some of the verbal indications of attitudes, hidden
quotations or allusions. (op. cit.: 15-16).
- informed attitudes (desirable at the basic level, indispensable at the
superior level): curiosity about discovering similarities and differences
between one's home culture and French culture; ... without losing one's own
identity, a basic desire to accommodate to the norms of the foreign society;
the determination to avoid over-generalization and stereotyping; awareness of
the fact that one's perceptions and judgments are patterned by one's home
culture, and are subject to temporary influences such as the phases of culture
shock; a critical approach to statistics and opinion polls: a concern to
know the date and scope of the evidence, even if one is not able to judge the
credibility of the agency; a fair-minded, relativistic appreciation of cultural
differences to the point of being able to present objectively some judgments
that foreigners make concerning one's home country. (op. cit.: 14)
These first steps towards a classification of cultural performance and
competence show how risky any attempt at developing a national instrument for
evaluating cultural competence is bound to be. Indeed, the efforts made by the
AATF to define and measure cultural proficiency bring to light some of the
major current obstacles to the integration of language and culture in the
teaching of foreign languages.
3 Current obstacles
3.1 The notion of "global village" and the assumed universality of modern
managerial, commercial, scientific cultures over the national
historical/humanistic (Pfeiffer 1988:52)
This view conceals more subtle forms of ethnocentricity in the guise of
universal pragmatic needs. For example, the Cassandra cries about the loss of
U.S. supremacy in business and diplomacy are giving the learning of foreign
languages and cultures self-serving, promotional incentives that might skew any
genuine understanding of other world views.
For example, in the United States, the fact that the 27 million Hispanics will
be 30 million by 1990 and 40 million by the year 2000 has aroused a sudden
interest of business firms in the Hispanic market. Commercial ads are becoming
extremely sensitive to the culture of the target clientele: COORS beer, for
example, is not shown drunk by individuals in bars, but at home, in a climate of
sharing and family togetherness. And yet, the anchorman lumps all Mexicans,
Salvadorans, Guatemalans and Nicaraguans together and refers to them with the
US political term "Hispanics", a label no Mexicans or Salvadorans would use to
characterize themselves.
Similarly, a recent article in the Boston Globe insists that it is not enough
for immigrant professionals to speak perfect English, they must "speak
American", what specialists in the newly emerging field of multicultural work
force management call "pragmatic strategizing" (Fliegel 1987). To quote from
the article: "An innovative computer specialist, Wei-Jing Chen organized and
presented his ideas in an "emblematic mode". From the Chinese point of view,
this mode, which is deferential, anecdotal and circuitous, seeks to address an
issue by describing the surrounding terrain. The great strength of this approach
rests in its patient thoroughness and its collectivist emphasis on reaching group
harmony by avoiding direct conflict. To Americans, however, it sounds vague
and is often too oblique to grasp. For Wei-Jing the key was learning how to take
full advantage of the American modes of organizing and presenting ideas. By
making repeated impromptu presentations in a series of private sessions,
Wei-Jing learned to organize his ideas and to speak on his feet". Similarly,
Maria Rodriguez, a Peruvian doctor, had a more self-contained conversational
mode than the American mode, a higher tolerance of silence, and she imposed
less of a response imperative. Saying something to a colleague or a superior
just for the sake of responding seemed to her presumptuous. Her American
colleagues concluded she lacked motivation and interest in her work and she was
not promoted. But through training she acquired responsive modes and solved her
"problem". The article ends with her comment, which "sums up the experience
of many immigrant professionals with pragmatics": "Before I didn't realize what
people were expecting from me. Now I feel free to speak, free to contribute".
The slightly uncomfortable feeling we have in reading these examples of
successful acquisition of cultural performance is that it doesn't seem to be
accompanied by any gain in cultural understanding, either on the part of the
clients or on the part of the American journalist who tells the story. On the
contrary, the closing lines of the piece seem to reinforce the ideological
stereotype of America as the land of freedom symbolized by American "free"
conversational style. A similar ethnocentric result would be achieved if ESL
teachers used the film Crosstalk (Gumperz, Jupp and Roberts 1979) as
behavioristic training for successful job interviews without at the same time
increasing the cross-cultural sensitivity of both parties to the cultural
dimensions of discourse.
3.2 The conduit metaphor for language. Influence of information-processing
theories on the way acquisition of knowledge and intercultural
communication are perceived to take place
The "conduit" metaphor, first coined by Michael Reddy (1979), expresses
the all-too pervasive notion that language is but a mere conduit for information,
like a water pipe or a tobacco pipe, a closed and culturally-neutral system of
linguistic forms and structures. Frank Smith decries this view in a recent
article entitled: "A Metaphor for Literacy: Creating worlds or shunting
information?": "Our perceptions of literacy are narrowed if not distorted by
the pervasive tendency, in education as well as in language theory and
research, to regard language solely as the means by which information is
shunted from one person to another" (Smith 1985:195). Smith echoes here the
concerns of other linguists, such as Joshua Fishman, who regrets the absence of
language consciousness in much of modern educational culture: "In our popular
culture and even in much of our intellectual culture, language is viewed as
merely a means of communication" (1982:5).
3.3 Emphasis on imperative knowledge. Influence of computer science technology
on the way knowledge is used
Emphasis is generally on the acquisition of "imperative" knowledge (how to
do things and how to make people do things) through the transmission and
exchange of information, not on "descriptive" knowledge (how to express and
understand things) through interpretation (Sussman and Abelson 1985). And yet
we know that the basis of culture is not only shared knowledge but shared rules
of interpretation (Garfinkel 1967), what Galisson (1987) calls the "CCP" of
words, or "charge culturelle partagée".
This obstacle is compounded by positivistic tendencies in education that
consider testability as a criterion of teachability (Coste 1980). The AATF
suggestions for the testing of cultural competence mentioned above illustrate
well how impossible it is to rate fairly a holistic competence through
discrete-point measures, and how the whole gets lost if one tries to equate it
to the sum of its parts.
3.4 Television and the illusion of immediate mediation between cultures
Television's ability to bring whole new worlds into one's living room and its
total claim on domestic and foreign reality hide the fundamental socio-centrality
of the medium as a model of meaning production (Ong 1977; Fiske and Hartley
1978; Gumpert and Cathcart 1982; Geisler 1985; Kozloff 1987). Television
viewers have been socialized into seeing foreign countries and events through
the cultural discourse of their own television or, if satellite and other
reception permit, through the cultural codes of a foreign television discourse.
The fact that a society's television programs reflect a certain cultural
consensus on the way social reality is viewed makes the medium into a unique
tool for teaching foreign cultures as they are presented through foreign
television; but at the same time, because of its appearance of universality,
television can be the greatest obstacle to appreciating and understanding
cultural differences, if it is not critically "deconstructed" and placed in its
own cultural discourse framework. For example, there is a long German tradition
of foregrounding the process of narration in television films: the story may
start with the end and retrace the events that led to it, the
filmmaker/narrator may appear in person to give metacomments on the narrative,
thus destroying the filmic illusion. In addition, because of the traditional
lack of interruptions for commercial purposes, German television viewers
have long attention spans and enjoy slow reflective narrative styles. By contrast,
American cinematic style prefers to obliterate all traces of enunciation:
narration is chronological, events unfold linearly as in "real time", viewers
are able to identify with the characters. Uninformed American viewers of German
television films tend to find the lack of American-type suspense and action
disconcerting, and the pace too slow; they confuse the lack of identification
possibilities with "intellectualism".
3.5 Lack of a theoretical framework for the discussion of culture and for
contrastive cultural analyses
Whereas the teaching of language draws on some descriptive nomenclature
based on a theory of language, the teaching of culture is left with its anecdotal
experiential base, or is forced into the theoretical framework of other disciplines
like history, sociology, anthropology, semiotics etc. In itself this might not be a
drawback, but it does mean that teachers of culture must consciously straddle
multiple disciplines and integrate their respective insights for themselves before
they can teach such an integration to their students. Teacher development
should broaden its traditional narrow philological or literary focus. Suggestions
have been made recently in this respect by McLeod (1976), Müller (1980, 1981)
and Kramsch (1983, 1987a, 1987c, 1988b).
4 Reframing traditional questions
The current efforts and obstacles outlined above prompt us to reassess the
traditional questions asked of foreign language education in instructional
settings.
A question that is often asked by policy makers and administrators in the
United States is: Is a foreign culture learned best in a domestic instructional
setting or by living and studying abroad? Although the question is, educationally
and financially, a valid one, there is to date no conclusive evidence to show that
study abroad per se leads to cross-cultural understanding, or to the development
of the cross-cultural personality. More needs to be known about what exactly
students learn when they go abroad (for a discussion of this topic, see Lambert
1989).
Rather than reduce the issue to either/or dimensions, we should be
concerned about the appropriate balance of cognitive and experiential learning,
both of which are equally essential to the acquisition of cultural competence
and performance. One of the advantages of studying abroad lies not only in the
practice of the forms of the language, but in the exposure to other
intellectual styles, other ways of framing questions, other ways of
interpreting social, historical, political facts. But even there, experiencing
these different discourse forms does not make them meaningful without conscious
cross-cultural reflection. This takes place at best in instructional settings.
A related question is the following: Is a foreign language course the best
place to teach culture, or is it best taught in a separate course taught in the
students' native language?
This question assumes that a language can be taught without teaching the
way in which that language expresses the world view of the social group or
society that speaks it. It assumes that one first learns skills, then content.
We must reframe the question as follows: What is the appropriate balance of the
development of socialization and literacy in the foreign language? The question
is not: Socialization or literacy? but: When and how much should we teach how
to perform social acts in the language, when and how much should we teach how
to interpret oral and written texts? (Kramsch 1987b).
In recent years, much has been made of "content-based" instruction for the
teaching of foreign languages. The most notable experiment has been at the
Lauder Institute School of Management, where advanced students attend
lectures in their fields conducted in the foreign language. These and other
immersion experiences have raised the question: What is the best way to teach
the advanced levels: language courses or content courses taught in the language?
Content courses are obviously an excellent way of using the language for various
academic and professional purposes. The evidence is not yet in concerning the
effect of these courses on the linguistic proficiency of advanced learners.
However, from a cultural point of view, the question is wrongly posed.
If these courses are to impart not only knowledge of foreign events, but also
a foreign discourse style, one has to ask: From what cultural perspective are
these courses taught? Which point of view is represented in the transmission of
cultural knowledge: that of the base or that of the target culture? the
"business/management" point of view or other intellectual points of view also?
Culture can only be really understood within relational systems of thought,
indeed within "ecological" forms of pedagogy, in the Batesonian sense (Bateson
1982). If the perspective of the lecturer cannot and indeed should not be
avoided, a lecturer, say, in the field of political science, should be able to
convey to students the cultural slant of the discourse of his/her discipline.
As renewed efforts have been made to link the teaching of foreign
languages to practical usages outside the classroom, there has been much concern
about which aspects of the language should be taught. Hence the traditional
question: If only short times of exposure are available, should schools teach
basic language skills or general education competencies? This question again
assumes that one can separate skill from content in the development of
communicative competence. It is a fallacy to believe, for example, that an uneducated
language learner will be successful in achieving the lofty goals of the US
President's Commission and the cultural goals of the US states' guidelines without
additional education. Rather, a cross-cultural approach should abandon the
native speaker as ideal or norm and focus instead on developing the learner's
biculturalism, as it does the learner's bilingualism. It should, therefore, instill basic language
courses with the intellectual excitement generated by forms of learning that are
typically attributed to general education and academic achievement: relational
and critical thinking, observation of and reflection on interactional processes,
interpretive ability (Kramsch 1987a, 1987b; Swaffar 1990).
Finally, given the administrative structure of higher education in the United
States and the traditional differential of prestige between teachers of literature
and teachers of language, one question has gained in importance in the last few
years: How can we break the hammerlock of humanists on the teaching of
foreign languages? or: Should the teaching of language be coupled to that of
literature or not?
Although this question is justified in view of certain academic excesses, it is
nevertheless of too limited a scope to offer a useful response. The issue is not:
By whom or in which domain of knowledge should languages be taught, but
what kind of discourse worlds should be activated and how? Literature is but
one of the many cultural discourses to which foreign language learners should
be exposed; others include everyday conversational discourse, scientific,
technical and political discourse, and the specific discourse of individual disciplines.
To the extent that they are the products of a given culture, works of foreign
literature read by foreign non-intended readers present a special challenge in
cross-cultural communication, one that eminently serves to further cross-cultural
education (Wierlacher 1985; Bredella and Haack 1988; Kramsch 1988a).
Language study should expose learners to a variety of discourse forms that coexist in
a given culture (Modern Language Association 1989).
5 Conclusion
This paper started by reviewing different educational traditions in the teaching
of language and culture in an attempt to break some stereotypical
misconceptions about the nature and role of culture in foreign language education. We
realized that even if some educational systems lay more emphasis on
performance or on competence, cultural performance is inseparable from cultural
competence and both are linked to the use of language in discourse. Current
efforts in the United States are directed at linking the teaching of language to that
of culture. In so doing, they face major possibilities and obstacles: political
incentives, advances in computer and video technologies as well as insights gained
from artificial intelligence, all open up possibilities for bringing the outside
world into the classroom and for teaching culture in a multidimensional,
authentic way. However, at the same time, they risk reducing the concepts of language
and culture to positivistic, information-processing models that only thinly
conceal age-old ethnocentric biases.
A proposal is made here to develop a cross-cultural approach to the
teaching of linguaculture at all levels and in all aspects of the curriculum. This
approach takes discourse as the integrating moment where culture is viewed, not
merely as behaviors to be acquired or facts to be learned, but as a world view to
be discovered in the language itself and in the interaction of interlocutors that
use that language.
Notes
1. The different states' guidelines consulted here are as follows: Foreign Languages Arkansas
Public School Course Content Guide. Little Rock, Arkansas: State Board of Education,
1984; Handbook for Planning an Effective Foreign Language Program. Sacramento, Califor-
nia: California State Department of Education 1985; A Guide to Curriculum Development
in Foreign Languages. Hartford, CT: Connecticut State Board of Education 1981; French
Language Program Guide. Honolulu, Hawaii: Department of Education, Office of Instruc-
tional Services/General Education Branch, Feb. 1979; Designing, Strengthening and Assess-
ing School Foreign Language Programs. A Guideline for Administrators and Teachers.
Bloomington, IN: Indiana Dept of Public Instruction, Division of Curriculum 1981; Ken-
tucky FL/ESL Skills Continuum Frankfort, KY: Foreign Language Education, Kentucky
Dept of Education 1980; Position Paper on Foreign Language Education in Michigan
Schools. Detroit: Michigan State Board of Education 1983; Modern Languages for Com-
munication. New York State Syllabus. Albany, NY: New York State Education Dept 1985;
Suggested Learner Outcomes. French. Oklahoma City: Oklahoma State Dept of Education,
August 1985; Handbook for Foreign Language Educators, Harrisburg, PA: Pennsylvania
Dept of Education 1983; Foreign Language Dept Goal Statement. Materials Guide. Level I
materials. Springfield, MA: Springfield Public Schools 1986; Secondary French Guidelines
for Levels I, II, III. Austin, TX: Foreign Language Section, Texas Educational Agency, Divi-
sion of Curriculum Development 1978; A Course of Study for Foreign Languages in Utah.
Salt Lake City, Utah: Utah State Office of Education, Division of Curriculum and Instruc-
tion 1980; Foreign Languages in Virginia Schools. Richmond, VA: Foreign Language Ser-
vice, Dept of Education. Sept 1977, vol. 1-7; A Guide to Curriculum Planning in Foreign
Language. Madison, WI: Wisconsin Dept of Public Instruction 1985; Instructions ministé-
rielles pour l'enseignement des langues vivantes. Journal officiel 524-6,6, 1986; Rahmenrich-
tlinien, Sekundarstufe 1, Neue Sprachen. Der Hessische Kultusminister (Ed.) Frankfurt
a.M.: Diesterweg 1980.
I am grateful to Laure Borgomano for making available to me her collection of States Gui-
delines.
2. Other U.S. initiatives worth mentioning are the Lauder Institute of Management and
International Studies, especially their Language and Cultural Perspectives Program at the
University of Pennsylvania. Besides professional proficiency in a foreign language, this
program provides substantial knowledge of contemporary and traditional culture of educated
native speakers of that language; history, economics, geography, literature, political science
and philosophy, and religion, as well as the arts, media, and sports are taught in the foreign
language; students are also given an understanding of management communication style,
behavior, and cultural protocol in a range of professional and social settings. Another
trend-setting initiative is the Integrative German Studies program at the University of Tübingen
sponsored by the Bosch foundation for graduate students and young scholars from the
United States. This is an interdisciplinary research project with the intention of developing
comparative and interdisciplinary German Studies for Americans. The first seminar run for
students from UCLA took place in summer 1988.
3. At the same time, the Los Angeles Times published an article entitled "Security Threat
Cited in Foreigner Jobs", where the "security threat" posed by the influx of foreign engineers
referred to the dangers inherent in foreign intellectual styles. "Some experts worry that the
traditional American emphasis on practical engineering problems may be eroding in favor of
theoretical engineering sciences, a more prestigious pursuit but one less likely to contribute
to American competitiveness in world markets" (Gillette 1988).
References
Abelson, H. and G.J. Sussman. 1985. Structure and Interpretation of Computer Programs. Cam-
bridge: MIT Press.
ACTFL Proficiency Guidelines. 1986. Hastings-on-Hudson, NY: ACTFL Materials Center.
AATF National Bulletin. 1989. "The Teaching of French. A Syllabus of Competence." AATF Na-
tional Bulletin 15. Special issue.
Attinasi, J. and P. Friedrich. 1988. "Dialogic Breakthrough: Catalysis and Synthesis in
Life-changing Dialogue." Unpublished manuscript.
Bateson, G. 1982. Steps to an ecology of mind. New York: Bantam.
Bourdieu, P. 1967. "Systems of education and systems of thought." International Social Science
Journal (Unesco) 19/3.338-58.
Bourdieu, P. 1982. Ce que parler veut dire. L'économie des échanges linguistiques. Paris: Fayard.
Bredella, L. and D. Haack, eds. 1988. Perceptions and Misperceptions: The United States and Ger-
many. Studies in Intercultural Understanding. Tübingen: Gunter Narr Verlag.
Coste, D. 1980. "Analyse de discours et pragmatique de la parole dans quelques usages d'une
didactique des langues." Applied Linguistics 1/3.244-252.
Fishman, J. 1982. "The Need for Language Planning in the United States." PEALS 20/4.5-6.
(Published by the Colorado Congress of Foreign Language Teachers, a constituent of ACTFL.)
Fiske, J. and J. Hartley. 1978. Reading Television. London: Methuen.
Fliegel, D. 1987. "Immigrant Professionals must speak American." The Boston Globe June 16.
Galisson, R. 1987. "Accéder à la culture partagée par l'entremise des mots a C.C.P." Etudes de
Linguistique Appliquée 67.119-40.
Galtung, J. 1985. "Struktur, Kultur und intellektueller Stil." Wierlacher 1985.151-193.
Gambhir, S. et al. 1987. New Directions New People A Video Series for Teaching Hindi as a
Foreign Language. Available from South Asia Regional Studies, U. of Pennsylvania,
Philadelphia, PA 19104-6305.
Gambhir, V. Forthcoming. "A set of culturally sensitive situations for South Asian languages."
Available from Dept of Modern Languages and Linguistics, Cornell University, Ithaca, NY
14853.
Garfinkel, H. 1967. Studies in Ethnomethodology. London: Basil Blackwell, 301-323.
Geisler, M. 1985. "'Heimat' and the German Left: The Anamnesis of a Trauma" New German
Critique 36.25-66.
Gillette, R. 1988. "Threat to Security Cited in Rise of Foreign Engineers." Los Angeles Times
January 20.
Gumpert, G. and R. Cathcart, eds. 1982. Inter/Media. Interpersonal Communication in a Media
World. New York: Oxford University Press.
Gumperz, J.J., T.C. Jupp and C. Roberts. 1979. Cross-talk: A Study of Cross-Cultural
Communication. London: National Centre for Industrial Language Training in association with the BBC.
Habermas, J. 1970. Theorie des kommunikativen Handelns. Frankfurt/Main: Suhrkamp.
Hart, D., S. Lapkin and M. Swain. 1987. "Communicative Language Tests: Perks and Perils."
Evaluation and Research in Education 1/2.83-94.
Hofstadter, R. 1963. Anti-intellectualism in American Life. New York: Vintage Books.
Jorden, Eleanor H. and Mari Noda. 1989. Japanese: The Spoken Language. Videotapes of core
conversations. Available from Sales Dept, Sony Video Software, 1700 Broadway, New York,
NY 10019.
Kozloff, S.R. 1987. "Narrative Theory and Television." Channels of Discourse. Television and Con-
temporary Criticism ed. by R.C. Allen. Chapel Hill: University of North Carolina Press.
Kramsch, C.J. 1983. "Culture and Constructs: Communicating Attitudes and Values in the
Foreign Language Classroom." Foreign Language Annals 16.437-448.
Kramsch, C.J. 1987a. "New Directions in the Teaching of Foreign Languages." The Governance of
Foreign Language Teaching and Learning. Proceeding of a Symposium Princeton, New Jersey,
October 1987. ed. by P. Patrikis. New Haven, CT: The Consortium for Language Teaching
and Learning.
Kramsch, C.J. 1987b. "Socialization and Literacy in a Foreign Language: Learning Through
Interaction." Theory into Practice (Ohio State University) 26/4.243-50.
Kramsch, C.J. 1987c. The Missing Link in Vision and Governance: Foreign Language Acquisition
Research. ( = Profession 87.) New York: The Modern Language Association of America.
Kramsch, C.J. 1988. "The Cultural Discourse of Foreign Language Textbooks." Towards a New
Integration of Language and Culture ed. by Alan Singerman. Middlebury, VT: Northeast
Conference.
Lambert, Richard D. 1989. International Studies and the Undergraduate. Washington, DC:
American Council on Education.
McLeod, B. 1976. "The relevance of anthropology to language teaching." TESOL Quarterly
10/2.211-220.
Modern Language Association. 1989. "Language Study in the United States. A Draft Statement."
MLA Newsletter Fall 1989.16
Müller, B.D. 1980. "Zur Logik interkultureller Verstehensprobleme." Jahrbuch Deutsch als
Fremdsprache 6.102-119.
Müller, B.D. 1981. "Bedeutungserwerb. Ein Lernprozeß in Etappen." Konfrontative Semantik ed.
by B.D. Müller, 113-154. Tübingen: Gunter Narr Verlag.
Murray, J.H., G. Furstenberg and D. Morgenstern. 1989. "The Athena Language Learning
Project: Design Issues for the Next Generation of Computer-Based Language Learning."
Modern Technology in Foreign Language Education: Application and Projects ed. by W. Flint
Smith, 97-118. Lincolnwood, IL: National Textbook Co.
Ong, W. 1977. Interfaces of the Word: Studies in the Evolution of Consciousness and Culture. Itha-
ca, NY: Cornell University Press.
Perkins, J. 1980. "Strength through Wisdom: A Critique of U.S. Capability. A Report to the Presi-
dent from the President's Commission on Foreign Languages and International Studies, No-
vember 1979." Modern Language Journal 64.9-57.
Pfeiffer, K.L. 1988. "Implications of the Intellectual Migration: Two Cultures Once Again?" Bre-
della and Haack 1988.37-59.
Porcher, L. 1983. "L'école dans tous ses états I. A la recherche de 'modèles' pédagogiques." Le
Français dans le Monde 179.25-29.
Reddy, M. 1979. "The Conduit Metaphor." Metaphor and Thought ed. by A. Ortony. Cambridge:
Smith, F. 1985. "A Metaphor for Literacy: Creating Worlds or Shunting Information?" Literacy,
Language and Learning. The Nature and Consequences of Reading and Writing ed. by D.
Olson, N. Torrance and A. Hildyard. Cambridge: Cambridge University Press.
Snow, C. 1987. "Beyond Conversation: Second Language Learners' Acquisition of Description
and Explanation." Research in Second Language Learning: Focus on the Classroom ed. by
J.P. Lantolf and A. Labarca, 3-16. Norwood, NJ: Ablex.
Swaffar, Janet. 1990. "Language learning is more than learning language: Rethinking reading and
writing tasks in textbooks for beginning language study." Foreign Language Research and the
Classroom ed. by B.Freed. Lexington: D.C. Heath.
Wierlacher, A., ed. 1985. Das Fremde und das Eigene. Prolegomena zu einer interkulturellen Ger-
manistik. Munich: Judicium Verlag.
Implications of Intelligent Tutoring Systems for
Research and Practice in Foreign Language
Learning
Ralph B. Ginsberg
In this and two subsequent papers I shall explore some implications of
artificial intelligence for the design and empirical analysis of learning environments
and teaching systems for foreign languages. At the same time, since artificial
intelligence is implemented on computers or in settings in which computers play a
fundamental role (such as hypermedia with interactive video and voice
synthesis), I shall be discussing foreign language learning that takes place in
designed environments that are for the most part neither classroom-based nor
classroom-managed. The issues with which I shall be concerned here, however,
do not depend in any important way on the technology of computers or
classrooms, and accordingly the research I shall review is pertinent to traditional
teaching methods as well as computer-based learning.
Computers are, in one respect, powerful tools which vastly increase our
capacity to perform logical, numerical, and symbolic computations. Most of the
computer aided instruction (CAI) that is now commonplace in virtually every
area of education uses them in this way. But in another respect computers are an
interactive and potentially intelligent medium within which we can carry out our
most important social and cognitive activities. Artificial intelligence (AI) is the
branch of computer science which tries to exploit this intelligence and
interactiveness. With regard to the transmission and acquisition of knowledge, this
entails addressing two basic questions:
- Learning environments: what are the characteristics of the physical and
social settings in which people learn efficiently and effectively, and how can
such settings be constructed or simulated?
- Knowledge communication: how is knowledge successfully communicated
and skill successfully imparted, and how can those communication
processes be emulated and enhanced?
In order to build computational models addressing these issues, AI
researchers have had to look deeply into the nature of knowledge, learning, and
teaching, in the light of what is now possible with new computer-based
technologies. Moreover, building functional intelligent programs or operational
interactive environments requires that issues in the design of learning environments
and knowledge communication be addressed in considerable specificity and
detail. In this second respect, as a cognitive science whose models are heavily
influenced by theories of computation, AI's implications for education are more
radical than those of "traditional" CAI, going well beyond the provision of tools
to support current practice (Papert 1980; Brown and Greeno 1984; Schank 1984;
Pea and Soloway 1987; and, for a cautionary note, Winograd and Flores 1985).
Thus, quite apart from the ultimate success of its artifacts, the process of AI
research and development holds the promise of deepening our understanding, and
hence of reshaping practice, in and out of the classroom. Indeed the theory and
analysis that lie behind attempts to build AI programs are already having a
major indirect influence on education through the nascent field of cognitive
science, where, along with linguistics and psychology, AI is arguably first among
equals (Haugeland 1984).
Drill-and-practice tutorial programs for foreign language learning go back
to the earliest days of CAI (see e.g. Suppes 1981). The lively interest in CALICO
and similar organizations points to continued growth and diversity. With the
exponential decrease in the cost of computing power, the development of powerful
information retrieval methods and optical disk storage, with computer driven
hypermedia systems including audio, CD-ROM, and voice recognition available
or just around the corner, with the increasing accessibility of multilingual
word-processing software, computer graphics, and desktop publishing running on
multitask workstations, the prospects are good for a new generation of
innovative and exciting developments. In contrast to "traditional" CAI and these other
newly emerging technologies, foreign language learning has not figured in any
significant way in applications of artificial intelligence to education, the bulk of
attention, for one reason or another, being focused on science, mathematics,
programming, and engineering. In the last few years some interesting prototypes
using AI techniques and technologies for foreign language learning have been
developed - Project ATHENA at MIT (Murray et al. 1987; Morgenstern 1986);
the CALLE project at Xerox (Feuerman et al. 1987; Xerox 1985); the LINGER
programs of Yazdani, Barchan and their colleagues at Exeter (Barchan 1987;
Barchan et al. 1986; Barchan and Wusterman 1988; Yazdani 1988); and Cerri
and Breuker's (1981) DART in the PLATO system - and several other projects
are currently being planned. But I think it would be fair to say that these
programs do not address the main issues raised in this paper, especially those
related to learning and intelligent tutoring. Nevertheless, I shall argue that
applications of AI to foreign language learning are feasible, and that the
potential impact of AI is quite as great as that of any of the new hardware
technologies, because, directly or indirectly, AI can enable us to use these technologies
more effectively.
This paper aims to give the reader a general overview of the character and
motivation of AI approaches to education and learning. Its emphasis is on
specifying how the key issues have been framed, sketching the theoretical
underpinnings and main conceptualizations used by AI researchers, and surveying the
alternative ways in which the issues have been addressed in practice. Although
at various points, especially in section 2, applications to foreign language
learning are suggested, detailed development of the implications is left for the two
succeeding papers, the one dealing with strategic areas of research and
development for foreign language instruction in computer-based environments, and the
other dealing with the methodology of empirical research on foreign language
learning in instructional settings.
The paper is organized as follows. The next section, motivated by the two
basic questions raised above, introduces AI programs designed to emulate good
teaching, known as "intelligent tutoring systems" (ITS), and computer-based
"microworlds", designed to be effective learning environments. To get a better
idea of what is different about AI, both are compared with their more familiar
cousins, CAI and simulation. In section 2 the main components of ITS
architecture, which are a convenient way of organizing the issues of learning and
teaching, are discussed. These include domain expertise, student modelling and
diagnosis, teaching strategies, planning and control, task structure, and the
learner-computer interface. Section 3 is concerned with how learning has been
conceptualized by AI researchers, and with the design principles that derive from
that. I shall not try to make a summary judgment about how AI will affect
foreign language learning: that would be premature, and in any case it is a
complex question that turns on many factors besides the merits of the tools. Rather,
I conclude by flagging some of the main themes concerning research and
instructional design in this general review.
1 Intelligent tutoring systems, learning environments, and CAI
1.1 Overview and Examples
Applications of artificial intelligence to education (broadly conceived to
include the learning of both cognitive and procedural skills, in and out of schools)
cover a very wide range of subject matters, goals and instructional styles. Even a
simple classification is difficult, not to speak of a useful definition of
"intelligence". Perhaps the most celebrated and controversial application of AI to
education, LOGO, does not have any evident "artificial intelligence" at all,
although, as its designer, Seymour Papert, has stressed (1980, Ch.7), it is pervaded
by AI conceptualizations and tools. In this paper I shall focus on what have come
to be known as "intelligent tutoring systems" (ITS) - I prefer the term to the
more usual "intelligent computer aided instruction" (ICAI) because of the often
unwarranted invidious connotations of the latter - i.e. computer-based learning
which in some sense is designed to emulate good teaching. (I shall, however,
discuss computer-based "microworlds", like LOGO, where there is no tutorial
intervention, briefly in section 1.3.) ITSs operate in a middle ground between
completely learner directed, highly contextualized "natural" learning
environments, on the one hand, and strongly teacher directed, decontextualized
classrooms, on the other. They draw their inspiration from and illuminate important
characteristics of both of these extremes.
The aim of intelligent tutoring systems, as Anderson, Boyle and Reiser
(1985) have succinctly put it, is to "provide the student with the same
instructional advantage that a sophisticated human tutor can provide". They are
intelligent, to the extent that they are successful, in the sense that their behavior would
be recognized as "good teaching", i.e. in the sense that they can emulate that
complex, intelligent human behavior. ITS originated in the early 1970's in the
work of Carbonell (1970) as an attempt to overcome some of the limitations and
rigidities inherent in what was then, and still is, the dominant form of computer
instruction (CAI), viz. a set of stored textual presentations, explanations,
exercises, responses to student input etc., often called "frames", presented to the
student according to predefined branching rules which incorporate a definition of
the curriculum, ideas about tutoring and remediation, and anticipated student
responses. Although the behavior of such programs as perceived by the student
is often not that different from ITS (Lewis, Milson and Anderson 1987), ITS and
CAI differ fundamentally in the way the knowledge they possess is represented
and in the way the components of the program are put together. I shall return to
the contrast between ITS and CAI in the next subsection.
Since the pioneering work of Carbonell a number of intelligent tutoring
systems have been built. Their teaching methods include help facilities that try to
figure out what the learner really wants to know, coaching, socratic and case
method dialogues, and highly structured and directive tutoring. Most have
remained as prototypes, partly because of the great expense involved in building
them (see Pea and Soloway 1987, for estimates and comparisons with the cost of
CAI; and Johnson 1987, for pragmatic considerations limiting the building of
programs), and partly because they were primarily designed for research
purposes in the first place. It is the results of this research - the evolving
formulation of the key issues to be addressed, the range of methods developed to
resolve them, and the experience gained from successes and failures along the
way - rather than a "bottom line" evaluation of how successful they have been in
changing the educational outcomes in existing school and training settings, that
are of primary interest for researchers and designers of foreign language
instruction. There are, however, several programs currently in use, including:
- STEAMER (Hollan et al. 1984), which makes use of computer graphics and
simulation to help students learn the operation of shipboard steam
propulsion systems;
- SOPHIE (Burton, Brown and De Kleer 1982), designed to teach various
aspects of electronic troubleshooting;
- WEST (Burton and Brown 1982) and WUSOR (Goldstein 1982), coaches
for the computer-based games WEST ("How the West Was Won") and
WUMPUS;
- GUIDON (Clancey 1987), designed to teach medical diagnosis through
case method dialogues; and
- LISP and Geometry tutors (Anderson and Reiser 1985; Anderson, Boyle
and Yost 1985), for the AI computer language LISP and high school
geometry.
Several other programs are currently in production.
Work in ITS has progressed at a steady pace. The state of the art in 1982
was summarized in an important book edited by Sleeman and Brown, where
most of the work to that date on both sides of the Atlantic was represented. In
the last year there has been an explosion of books and articles in the area: two
thoughtful and comprehensive books reviewing the field (Wenger 1987; and
Polson and Richardson 1988), three edited volumes (Kearsley 1987; Psotka,
Massey and Mutter 1988; Mandl and Lesgold 1988), and several useful review
articles (e.g. Anderson, Boyle and Reiser 1985; Olsson 1986; Dede, Zodhiates
and Thompson 1985), with many more in press.
1.2 ITS and CAI
The goals and aspirations of ITS can, perhaps, best be grasped by
contrasting it with the familiar CAI programs it is meant to improve. ITS's
improvements move it closer to successful human teachers and supportive learning
environments, in and out of the classroom, and in this respect they bear on
general issues of research in foreign language learning.
One potentially important drawback of CAI - one should not exaggerate
how important this or the other drawbacks noted below really are for any
particular pedagogical goal or subject matter - is its rigidity. "Traditional" CAI (the
quotation marks signal caricature of both CAI and ITS) can be thought of as a
directed graph or flowchart, consisting of nodes representing textual, graphical
and audiovisual presentations of material, menus, questions with their answer
categories, error messages and tutorial explanations etc.; and a set of links
connecting each node to the program's next actions (presentations). The links
embody a detailed specification of the flow of control, as determined by student
responses at the originating node or by some other, prespecified branching
mechanism. The course author, perhaps assisted by an authoring program, must
explicitly enter all of the nodes and specify all of the links (student responses
and tutorial reactions). A session then consists of one of the possible paths
through the graph. Such a scheme is satisfactory if the student only needs drill
and practice, exercises, and an occasional tutorial. But it puts a great burden on
the designer to anticipate all contingencies, a burden that is difficult to bear for
rich, nonmechanical activities, like using a foreign language. By contrast, like
human teachers, ITS tries to plan sessions with the student on the fly (Peachey
and McCalla 1986):
- CAI requires a detailed predetermined course graph structure, while ITS
constructs it dynamically as the session progresses;
- the fixed branching in CAI must be chosen with a wide range of students in
mind, while ITS teaching plans and branching actions are tailored to each
student;
- CAI must anticipate all relevant events, while ITS can revise its plan and
start from an incomplete specification;
- CAI's characterization of the student is one of degree of competency
obtained, and its analysis of student responses is based simply on answer
matching, while ITS can interact with a much more subtle model of the student.
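The "traditional" CAI side of this contrast can be made concrete with a minimal sketch. It assumes a toy French article drill; the names Node, COURSE and run_session are illustrative inventions and are not taken from any actual CAI system or authoring language. Every frame and every link is entered in advance, and a session is simply one path through the graph, selected by answer matching.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One CAI 'frame': a presentation plus the author's fixed branching links."""
    prompt: str
    # Maps an anticipated student response to the name of the next node.
    links: Dict[str, str] = field(default_factory=dict)
    # Where to go when the response was not anticipated by the author.
    default: Optional[str] = None


# Hypothetical three-frame drill; the course author must enter all nodes
# and all links by hand before any student sees the program.
COURSE = {
    "ask": Node("Fill in: ___ maison (le/la)",
                {"la": "praise", "le": "remedial"}, "remedial"),
    "remedial": Node("'maison' is feminine. Try again: ___ maison",
                     {"la": "praise"}, "remedial"),
    "praise": Node("Correct: la maison."),
}


def run_session(course: Dict[str, Node], start: str,
                responses: List[str]) -> List[str]:
    """Follow one path through the graph, driven only by answer matching."""
    path = [start]
    current = start
    for response in responses:
        node = course[current]
        nxt = node.links.get(response.strip().lower(), node.default)
        if nxt is None:  # terminal frame: no outgoing links, session ends
            break
        path.append(nxt)
        current = nxt
    return path


if __name__ == "__main__":
    print(run_session(COURSE, "ask", ["le", "la"]))
```

Note that an unanticipated answer such as "une" can only be routed to the author's default link; nothing in the graph can work out why the student produced it, which is the rigidity the text describes.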
ITS, then, tries to respond more flexibly than CAI. But ITS also tries to
teach more complex skills, and this has important consequences for how it must
be designed. For simple skills and highly structured, low level tasks, responses
can be predicted and paths constrained. But for more complex skills and more
integrative tasks (e.g. mathematics problem solving, writing, or carrying on a
conversation in a foreign language) students must play a more active role in the
process, experimenting with various aspects of the domain and determining their
own courses of action. Such student behavior may well be impossible to
anticipate in detail. Thus, if the tutor is to be helpful, it must, like human tutors, be
able to solve problems as they arise, i.e. it must have its own knowledge of the
subject matter and be able to put it to use. Moreover, with complex skills,
inferences from what the learner does to what he knows (can do) and why he does
it are not at all straightforward. Expert knowledge of the subject matter and
more complex, sensitive representations of the student are a second major
difference between ITS and CAI.
A third, more subtle difference has to do with the way knowledge of the
subject matter is represented in ITS and CAI. In CAI the knowledge is
contained in the procedures, i.e. the branching rules, which contain the
possible answers to exercises and drills (including the correct one) as
conditions and the CAI responses as actions. Moreover, in the branching rules
teaching knowledge and subject matter knowledge (condition and action) are tied
strongly together. ITS programming structures follow a different strategy which
has proved successful in other AI applications (e.g. expert systems, computer
vision, and natural language processing). First, the teaching and domain
knowledge are separated into different modules, so that each can be
independently modified and so that they can be more flexibly combined as the
planning mechanisms require. Second, within the teaching and domain modules,
expert knowledge is often represented "declaratively", as a set of facts and
rules in a knowledge base, which can be modified independently of the
procedures that use them; inference rules and other computational mechanisms
are then provided to access the knowledge base and put it to use. It is from
the representation of expert knowledge and the separation of teaching and
subject matter knowledge that ITS derives its power, and indeed these features
are what make ITS possible.1
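The contrast between fused branching rules and separated knowledge can be made concrete with a deliberately small sketch. It is an illustration only, not a description of any actual CAI or ITS program: the arithmetic items, the function names (cai_respond, solve, its_respond) and the feedback messages are all assumptions introduced here.

```python
import operator

# CAI style: each anticipated answer is wired directly to a canned response,
# so subject matter (condition) and teaching (action) live in the same table.
CAI_BRANCHES = {
    ("3 + 4", "7"): "Correct. Next item.",
    ("3 + 4", "1"): "You subtracted instead of adding.",
}

def cai_respond(item, answer):
    # Anything the author did not anticipate falls through to a generic message.
    return CAI_BRANCHES.get((item, answer), "Incorrect. Try again.")

# ITS style, part 1: declarative domain knowledge (what each operator means).
DOMAIN_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def solve(a, op, b):
    """The 'expert': solves any item, including ones never listed in advance."""
    return DOMAIN_OPS[op](a, b)

# ITS style, part 2: teaching knowledge, stated without reference to any item.
def its_respond(a, op, b, answer):
    if answer == solve(a, op, b):
        return "Correct."
    for other, fn in DOMAIN_OPS.items():
        if other != op and fn(a, b) == answer:
            return f"That is the result of '{other}', not '{op}'."
    return "Incorrect; let us work through it together."
```

In this sketch the item "5 + 2" with the answer 3 receives only the generic message from the branching table, whereas the separated version can still recognize the answer as a subtraction, because its domain knowledge was never tied to a list of anticipated items.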
A final difference between ITS and CAI has to do with the kinds of things
they try to teach and the instructional principles they use to do so. Largely
through the influence of such ITS researchers as John Seely Brown, Richard
Burton and Allan Collins, and the seminal, closely related work of Seymour
Papert, our views of what does and should go on in classrooms or on computers,
and how the learning process should be organized, are being transformed. In
particular (see Pea and Soloway 1987) a fact-oriented, classroom-based and
classroom-managed "transmission view of knowledge", in which the major
pedagogical activity is the presentation of well-structured material to be
learned through lecture, demonstration, recitation, drill and practice (the
setting and ethos of CAI!), is giving way to a more learner-centered view, in
which an active learner, using prior understandings ("frames", "schemas", and
"mental models") and a variety of domain-specific and general strategies,
acquires knowledge in contexts which are isomorphic to the situations in which
that knowledge will eventually be used (Collins, Brown and Newman 1987; Papert
1980; and a vast, often polemical literature). This "situated" view of
knowledge and learning, which draws its "success models" from such settings as
apprenticeship, collaborative work, and games (IRL 1988), and which evokes
debates about "learning vs. acquisition" and "immersion vs. drill and practice"
in foreign language pedagogy, is at the heart of many ITS applications, and has
clear implications for foreign language learning.
1.3 Learning Environments and Microworlds
Although this paper is primarily concerned with ITS, it would be useful to
digress somewhat at this point to consider another important line of
applications of AI ideas and techniques to education, the design and
construction of computer-based microworlds. Microworlds intersect ITS in the
"environment" and "interface" components of ITS architecture, to be discussed
in section 2.6. They are of interest for foreign language learning not only in
their own right, as analogs of "authentic" materials and content-based
learning, but also because of the insights into learning in "natural" settings
that have been generated by their designers, especially Papert (1980).
It is hard to give a crisp definition of the concept of a "microworld". The
meaning of the term lies more in the intentions and goals of the designers and
in how microworlds are constructed than in what it describes. Pea's (1987)
definition captures much of current usage: "A microworld is a structured
environment that allows the learner to explore and manipulate a rule-governed
universe, subject to specific assumptions and constraints, that serves as an
analogical representation of some aspects of the natural world". The best
known, but by no means the only, examples of computer-based microworlds are the
LOGO programming language (Papert 1980) and its extensions to "turtle" geometry
(Abelson and DiSessa 1980) and more recently to Lego-LOGO (Papert 1986). LOGO
is now widely used in schools in the US and the UK, and it has generated
considerable theoretical and empirical research (see e.g. Yazdani 1984; O'Shea
and Self 1983; Lawler and Yazdani 1987).
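For readers unfamiliar with turtle geometry, the flavor of such a rule-governed universe can be suggested by a few lines of Python. This is a sketch in the spirit of the LOGO turtle, not LOGO itself: the Turtle class, its method names and the polygon procedure are assumptions made for illustration, and nothing is drawn on a screen (the path is simply recorded).

```python
import math

class Turtle:
    """A minimal turtle: a position, a heading, and a record of the path traced."""

    def __init__(self):
        self.x, self.y = 0.0, 0.0
        self.heading = 0.0                 # degrees; 0 points along the x axis
        self.path = [(self.x, self.y)]

    def forward(self, distance):
        angle = math.radians(self.heading)
        self.x += distance * math.cos(angle)
        self.y += distance * math.sin(angle)
        self.path.append((self.x, self.y))

    def right(self, degrees):
        self.heading = (self.heading - degrees) % 360


def polygon(turtle, sides, length):
    """Trace a regular polygon by repeating 'go forward, then turn'."""
    for _ in range(sides):
        turtle.forward(length)
        turtle.right(360 / sides)
```

A learner who calls polygon(Turtle(), 4, 10) and then tries other numbers of sides can find out by experiment that the turns must add up to 360 degrees for the figure to close. Knowledge is built into what the learner is able to do rather than stated in an explanation.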
As mentioned above, the role of artificial intelligence is not obvious as a
child programs in LOGO. Nor is AI obvious in the graphics- and simulation-based
ITS STEAMER, where learning sessions are largely directed by the learner and
the computer itself does not seem to do anything that could be called
intelligent. But as Papert (1980, Ch.7) and Hollan, Hutchins and Weitzman
(1984) stress, AI is in fact fundamental to the design and construction of both
systems. First of all, the LOGO language itself is a variant of the AI
programming language LISP; STEAMER's graphic editors and inspectable
simulations are based on LISP and AI object-oriented programming techniques.
Neither would be possible without the power of these languages. Secondly, in
both STEAMER and LOGO the interface through which the learner interacts with
the computer draws heavily on the interactive programming environments
developed for AI, and again neither would be possible without it. But most
importantly, the microworlds themselves derive from an analysis of how people
reason about geometry or dynamic systems, an analysis of the knowledge that is
to be learned, and an analysis of the way that knowledge has to be represented
if learning is to be effective. Such essentially cognitive analyses, and the
computational models that implement them, are AI's greatest achievements and
promise. The resulting knowledge representations are built into the fabric of
the microworlds: into the objects the learner can manipulate, the physical set
up, her possibilities for action, the tasks she can perform, the tools
available to do them. Microworlds are often engaging because they are
"realistic" and apparently relevant, as are simulations and "authentic"
materials. They are also, like games and other informal learning environments,
fun and self-motivating. But it is the cognitive analysis leading to an
implicit presentation of knowledge by structuring the possibilities to be
explored by the learner (as contrasted with knowledge presented through
explanations or expository texts), not the motivational effects, that
distinguishes the AI based applications from "traditional" approaches to
education.
The educational philosophy lying behind the construction of Papert's
mathematics and physics microworlds is one of "discovery" learning, of
learning-by-doing, of giving the learner the opportunity and tools to learn by
himself, rather than trying to teach him. While Papert's arguments are, in my
view, compelling, simply letting a learner explore a microworld (or a complex
simulation, or a "natural" environment in which learning might take place, for
that matter) without any "guidance" whatever has several limitations which
should be noted:
- learners may form grossly incorrect models (conceptions) of the domain
they are to "learn";
- learners do not know the cause of their errors, or even when they make
them, so errors are nonconstructive;
- learners may not explore all of the interesting parts of the microworld,
getting stuck in a small subworld;
- learners may not get into fine structure, even if they have mastered the
grosser features;
- learners may not explore the microworld effectively, e.g. cycling in an
incorrect procedure or fixating on complex problems before their simpler
components have been encountered;
- learners may learn slowly, spending a lot of time on irrelevant activities;
- learners may not see the context of the simplifications in the microworld or
the limitations of the specific tasks or setting.
These reservations argue for some form of instructional intervention or
guidance, and even give some clues as to the most important issues in choosing
what form that guidance should take. The key questions are empirical:
- how do people learn? and
- how can that learning be enhanced?
Considerable progress has been made for several domains, notably
mathematics, physics and engineering, where microworlds have been designed. I
shall return to these questions when "increasingly complex microworlds"
(Burton, Brown and Fischer 1984) and the Collins-Brown-Newman (1987) framework
are discussed in section 3.2.
2 The architecture of ITS
Intelligent tutors as described in sections 1.1 and 1.2 are composed of
seven interdependent "architectural" components which perform the various
functions necessary for teaching. The architecture can also serve as a
convenient way of organizing discussions of the human tutors, coaches, and even
classroom instructors that the ITSs are meant to emulate, since they, of
course, must meet these functions as well. The components are:
- domain knowledge, i.e. expert knowledge of the subject matter to be taught,
which the ITS uses to solve problems, generate explanations etc.;
- a student model, i.e. a representation of the student's knowledge of the
domain, often including a history of the student's responses;
- diagnostic methods for updating the student model as a function of the
student's responses;
- teaching knowledge, i.e. knowledge of how to teach effectively in particular
circumstances, along with a set of teaching tactics for accomplishing that;
- planning and control mechanisms which, on the basis of the current state of
the student model and the domain and teaching knowledge, determine what
to do;
- an environment, i.e. a set of tasks (activities) that the learner is to
perform, and the tools given him to do so; and
- an interface by which the tutor and the student communicate.
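How the seven components fit together can be suggested by a toy skeleton. It is a sketch under strong simplifying assumptions, not the design of any existing tutor: the class names (StudentModel, ToyTutor), the two arithmetic skills and the feedback strings are invented for illustration, and each component is reduced to a few lines.

```python
from dataclasses import dataclass, field

@dataclass
class StudentModel:                      # 2. student model
    mastered: set = field(default_factory=set)
    history: list = field(default_factory=list)

class ToyTutor:
    # 1. domain knowledge: the tutor can itself solve every task it sets
    skills = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b}
    # 6. environment: the tasks the learner is asked to perform
    tasks = [("add", 2, 3), ("sub", 9, 4)]

    def __init__(self):
        self.student = StudentModel()

    def diagnose(self, task, answer):    # 3. diagnostic method
        skill, a, b = task
        correct = answer == self.skills[skill](a, b)
        self.student.history.append((task, answer, correct))
        if correct:
            self.student.mastered.add(skill)
        return correct

    def tactic(self, correct):           # 4. teaching knowledge
        return "Well done." if correct else "Let us look at that one again."

    def next_task(self):                 # 5. planning and control
        for task in self.tasks:
            if task[0] not in self.student.mastered:
                return task
        return None

    def step(self, answer_fn):           # 7. interface: one exchange with the student
        task = self.next_task()
        if task is None:
            return None
        return self.tactic(self.diagnose(task, answer_fn(task)))
```

Here answer_fn stands in for the student: it receives a task and returns an answer. Even at this scale the components cannot be designed in isolation, since the planner reads what the diagnostic method writes into the student model.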
ITSs differ in how these components are actually constructed, how they are
interrelated, and what emphasis is given to each in the research and
development process. Still, to build functioning prototypes on a computer, ITS
designers and researchers have had to be very specific about how each is to be
implemented. Although for expository purposes the components are treated as if
they were separate modules, they must to some extent be designed and evaluated
concurrently, since after all they must work together to produce effective
learning. Whether or not it would be possible, or even desirable, to replicate
the work I shall review by building an ITS for foreign language instruction, it
is my contention that considering how the issues have been addressed by ITS
researchers helps define the interesting research and instructional design
issues for the field. In this section I discuss the components of ITS
architecture and review some of the approaches ITS researchers have taken in
each.
2.1 Domain Knowledge
As noted in section 1.2, in order for an ITS to provide flexible instruction
and respond to arbitrary and unanticipated student responses, it must include
an "expert system" which is able to solve problems in the domain that it is
teaching. For the most ambitious foreign language tutor this would require a
program that could understand natural language in very unstructured settings, a
remote goal given the current state of the art in AI. For beginning and
intermediate lessons (Feuerman, Marshall, Newman and Rypa 1985; Xerox 1985), or
for learning sublanguages in structured technical domains (Geesey et al. 1989),
however, the requisite domain expertise is currently within reach.
The expert knowledge components of existing ITS fall into three broad
groups (Anderson 1988). The first consists of "black box" experts, complex
algorithms or simulations which can solve the problems but in which knowledge
is not explicitly represented and the processes of solution are not useful for
instruction. McArthur's algebra tutor (McArthur 1986; McArthur et al. 1987),
which uses the algebraic programming system REDUCE; the gaming tutors WUSOR
(Goldstein 1982) and WEST (Burton and Brown 1982), which contain algorithms to
compute the optimal move for any given position; and SOPHIE-I (Burton, Brown
and De Kleer 1982), which uses a general purpose circuit simulator, are
examples. In language instruction most current parsers would fall into this
class. At the other extreme are "glass box" experts (Goldstein and Papert
1977), elaborate cognitive models and qualitative process models which
represent the domain knowledge and reason about it in the same way that human
beings do, so that describing and observing the expert's problem solving
process would be a useful component of the instruction. Anderson's LISP and
Geometry tutors (Anderson and Reiser 1985; Anderson, Boyle and Yost 1985) and
SOPHIE-III (Brown and Burton 1987) are examples. It is hard to imagine the
analog of these models for foreign language instruction, since a detailed,
"psychologically real" theory of language production and understanding would be
required, but for some narrowly circumscribed tasks computational models might
be possible. In between are expert systems, developed using knowledge
engineering techniques, which represent knowledge in an explicit way but may
not reason about it the way human beings do. (While they may have the same
knowledge base as cognitive models, they manipulate it differently.) A classic
example is GUIDON (Clancey 1986b, 1987), whose expert system consists of a
knowledge base containing facts and relations in its domain (bacterial
infections), knowledge of diagnostic techniques, and a set of procedures
(inference rules and interpretations) which use the knowledge base and
techniques to diagnose diseases.
While it is not crucial how domain problems are solved, there are,
nevertheless, problems that have to be addressed in one way or another if the
domain knowledge is to be useful for tutoring. In order to help students,
tutors have to be able to explain how the correct answer is derived and guide
the students along a path to it. Glass box experts already do this. The
evolution of the GUIDON case method tutors from classical expert system to a
more human-like model (Clancey 1986b) was motivated by just such
considerations. Black box and classical expert systems in ITS have been made
more "articulate" by augmenting them with devices like Burton and Brown's
(1982) "issue recognizers", or Clancey's (1987) "t-rules", which compare
student and expert performance (however generated) and base tutorial actions on
differences ("issues") they are designed to detect. In this regard issue
recognizers automate some aspects of CAI branching, although in order to do so
effectively they may need more information than the match between correct and
incorrect answers.
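The idea of wrapping a black box expert in an issue recognizer can be sketched as follows. The sketch is loosely inspired by an arithmetic game in which three numbers are combined into a move, but it is not a reconstruction of WEST: the rule that the largest value is the best move, the strict left-to-right evaluation, and the names all_moves, best_move and issues are simplifying assumptions.

```python
from itertools import permutations, product

OPS = {
    "+": lambda x, y: x + y,
    "-": lambda x, y: x - y,
    "*": lambda x, y: x * y,
}

def all_moves(numbers):
    """Black box expert: every value of (a op1 b) op2 c, with the operators used."""
    for a, b, c in permutations(numbers):
        for op1, op2 in product(OPS, repeat=2):
            yield OPS[op2](OPS[op1](a, b), c), {op1, op2}

def best_move(numbers):
    # Assumption for this sketch: the best move is simply the largest value.
    return max(all_moves(numbers), key=lambda move: move[0])

def issues(student_ops, numbers):
    """Issue recognizer: operators in the expert's move that the student did not use."""
    _, expert_ops = best_move(numbers)
    return expert_ops - set(student_ops)
```

The expert here explains nothing; it only produces a best move. The recognizer supplies the articulateness by naming the difference between the two performances, for example that a student who only ever adds has not used multiplication where the expert would.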
Besides articulateness, tutoring imposes other demands on the domain
knowledge component. A good tutor needs some knowledge of how incorrect or
partial answers are generated if it is to help students get past these blocks.
Many tutors represent incorrect knowledge in the form of "false facts",
"mal-rules", and "buggy" procedures, to deal with this problem. In foreign
language the CALLE tutor (Feuerman, Marshall, Newman and Rypa 1985; Xerox
1985), which uses the LFG processor developed by Kaplan (Kaplan and Bresnan
1982), has this happy property of being able to account for ungrammatical as
well as grammatical responses. The problem of knowledge representation is
further compounded by the fact that in many domains there are many correct
responses and many ways to reach a given correct response, each containing
useful pedagogical information. A single algorithm or expert system may not be
sufficient to capture this variation; multiple representations and views of the
expert knowledge may be required (Olsson 1986). Thus, although to a large
extent domain knowledge and teaching knowledge are separated in ITS, there are
practical restrictions on the modularity of the system.
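What it means to represent incorrect knowledge as mal-rules can be shown with a small foreign language example. It is purely illustrative and has no connection with CALLE or the LFG processor: the three irregular verbs, the two mal-rules and the function names are assumptions chosen for brevity.

```python
IRREGULAR = {"go": "went", "eat": "ate", "take": "took"}

def regular_past(verb):
    """The regular English rule, ignoring spelling changes such as consonant doubling."""
    return verb + "d" if verb.endswith("e") else verb + "ed"

def correct_past(verb):
    return IRREGULAR.get(verb, regular_past(verb))

# Mal-rules: named, explicit ways of producing a wrong form (hypothetical set).
MAL_RULES = {
    "overregularization": regular_past,
    "no tense marking": lambda verb: verb,
}

def explain(verb, response):
    """Account for a response as correct, as the product of a mal-rule, or not at all."""
    if response == correct_past(verb):
        return "correct"
    for name, rule in MAL_RULES.items():
        if rule(verb) == response:
            return name
    return "unexplained"
```

With these rules "goed" for "go" is accounted for as overregularization rather than merely marked wrong, while "gone" remains unexplained. An answer-matching program would treat both in the same way.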
2.2 The Student Model
The maintenance of a nontrivial student model, i.e. a fairly detailed
representation of what the student "knows", is another important property that
ITS shares with successful human tutors. Student modelling is a form of
cognitive modelling as practiced in cognitive science generally, and a sibling
of user modelling in computer science. Although all three share many techniques
and strategies, unique issues arise because of the demands of learning and
instruction. A student model is a sine qua non for individualized instruction,
since it is the basis for what problems to present, what to teach, and how to
adapt teaching strategies to experience with the student. At the least the
student model requires a systematic description of the skills and subskills to
be learned, but very often it is as elaborate as the "expert" model of the
domain, containing not only the student's knowledge (correct and incorrect) but
her goals and strategies for employing it. Since presumably the goal of
learning is for the student to become more like an expert, the differences
between the two are relevant to instruction, and accordingly the student and
expert models cannot be incomparable. A great range of student models,
differing in the kind of data required, the way knowledge is represented, and
relationship to the expert model (VanLehn 1988b), has been used in ITS.
The simplest student models are what Carr and Goldstein (1977) call
"overlay" models, in which the student's knowledge is represented as the subset
of expert skills which he has mastered. The most complex models, e.g. those
maintained by the LISP and Geometry tutors of Anderson and his colleagues
(Anderson, Boyle and Reiser 1985; Anderson, Boyle, Corbett and Lewis 1988;
Anderson 1988), are elaborate computer simulations accounting in detail for the
student's behavior. An important characteristic of most student models in ITS
is the representation of both correct and incorrect knowledge; i.e. student
models represent misinformation and "buggy" procedures as well as a simple lack
of knowledge, as in overlay models. Using the student model, an attempt is made
to account for which specific errors are made and why, so that remediation can
be tailored to the problems of the individual student. In both the
decomposition of skills and the explicit representation of errors ITS models
contrast with the parametric representation of the student (how much he knows
or his level of mastery) commonly maintained in CAI and in the psychometric
models underlying classical test theory and item response theory. Moreover, to
the extent that student models are cognitive models, representing mental
processes that lead to specific behaviors of particular individuals, student
models are complex causal structures requiring more data for fitting than the
answers given on tests or the final solutions to problems.
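An overlay model is simple enough to be written down in a few lines, and doing so shows both what it captures and what it leaves out. The skill names and the OverlayModel class below are hypothetical, introduced only to illustrate the data structure.

```python
# Hypothetical decomposition of a beginner's course into expert skills.
EXPERT_SKILLS = {"articles", "present tense", "past tense", "word order"}

class OverlayModel:
    """The student is described only by which expert skills have been mastered."""

    def __init__(self, expert_skills):
        self.expert_skills = set(expert_skills)
        self.mastered = set()

    def record(self, skill, correct):
        if skill not in self.expert_skills:
            raise ValueError(f"not an expert skill: {skill}")
        if correct:
            self.mastered.add(skill)
        else:
            self.mastered.discard(skill)

    def missing(self):
        """Skills still to be taught: expert knowledge minus student knowledge."""
        return self.expert_skills - self.mastered
```

The model can say that "past tense" is missing, but it has no place to record which wrong form the student produced or why. That is exactly the information that models with explicit misinformation and "buggy" procedures add.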
Two examples of ITS student models which would be realizable in a foreign
language tutor and could be used in research to track student learning are the
procedural networks used by Burton and Brown (Brown and Burton 1978; Burton
1982) in their WEST tutor; and Goldstein's (1982) genetic graphs, used by his
WUSOR program to tutor strategies for the reasoning game WUMPUS. In a
procedural network the student's knowledge is modelled by analogy with a
computer program, with a set of (LISP) procedures (or, e.g., PASCAL
subroutines) representing the subskills, connected by a control structure
corresponding to calls in a computer program. Several different procedures
which accomplish the same thing may be represented, as can "buggy" procedures
which result in incorrect responses. (Control structures, however, do not
contain bugs, nor are there problems, like limitations of working memory, which
are built into more elaborate models.)
In a genetic graph the nodes correspond to procedural skills, rules,
strategies, misinformation and lack of information that the players may or may
not have. The relationships between the nodes (links) are typed to represent
evolutionary and logical relationships between them, such as refinement,
generalization, specialization, analogy, and prerequisite; the typed links
distinguish the genetic graph from a simple skill network. To add further
structure, nodes (rules) can be grouped into "islands" to capture separability;
and declarative facts justifying rules can be added. The student model is an
overlay of the genetic graphs, not an overlay of the final, perfect skills
which the expert possesses. It thus contains errors and partial knowledge as
well as the components of expert competence. Genetic graphs have obvious
implications for tutoring, e.g. "tutor at the frontier of the student's
position in the graph" and "use the links as a basis of explanation", although
other tutorial principles, e.g. "vary examples", come into play as well.
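A genetic graph and the two tutorial principles just quoted can be sketched as a list of typed links. The fragment below is not taken from WUSOR: the nodes are hypothetical English verb skills (one of them an error), and the names Link, GRAPH, frontier and explain are introduced only for illustration.

```python
from dataclasses import dataclass

LINK_TYPES = {"refinement", "generalization", "specialization", "analogy", "prerequisite"}

@dataclass(frozen=True)
class Link:
    source: str
    target: str
    kind: str

    def __post_init__(self):
        if self.kind not in LINK_TYPES:
            raise ValueError(f"unknown link type: {self.kind}")

# Hypothetical fragment; "overregularized past" is an error node, not a target skill.
GRAPH = [
    Link("present tense", "regular past", "prerequisite"),
    Link("regular past", "irregular past", "refinement"),
    Link("regular past", "overregularized past", "generalization"),
]

def frontier(known):
    """Links leading from a node the student has to a node not yet acquired."""
    return [link for link in GRAPH if link.source in known and link.target not in known]

def explain(link):
    """Use the link type as the basis of an explanation."""
    return f"'{link.target}' is related to '{link.source}' by {link.kind}."
```

The student model is then simply the set of nodes passed to frontier as known. Because that set may include error nodes, the overlay is laid over the whole developmental graph rather than over the expert's final skills.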
Both procedural networks and genetic graphs contain a very fine breakdown
of skills and subskills so that the tutor can focus on the specific problems
the student has. This should certainly be possible for foreign language.
Specification of the relationships between subskills, either in terms of calls
or typed links, would be more difficult, but the rationale behind the
sequencing of presentations in curricula and textbooks might give some
guidance. Evolutionary relationships, and the explicit representation of errors
that occur at various stages, are particularly appealing for foreign language
teaching because they could capture some of the implications of interlanguage
studies and stage theories of language learning.
The most elaborate student models, those maintained by Anderson's LISP
and Geometry tutors, consist of a set of rules, like the expert rules, which
when fired simulate the student's behavior in detail. (The technique is called
"model tracing".) Producing such models is very time consuming, since currently
they require coding "malrules" to capture all of the errors that are likely to
be produced. Moreover, fitting them for an individual student requires a
substantial amount of very detailed data. While the interface of Anderson's
tutors is constructed to ensure that the requisite data is available, this
device may not be possible in domains such as foreign language learning, where
responses follow quickly on one another without much evident intervening
calculation. Still, Anderson's success and the power of processors like LFG do
not rule out model tracing as a viable long term strategy.
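The logic of model tracing, though not the scale of Anderson's tutors, can be conveyed by a toy example. The domain (solving ax + b = c), the two expert rules, the single malrule and the trace function are all assumptions made for this sketch; an equation is represented as the tuple (a, b, c).

```python
def subtract_constant(state):            # expert rule: ax + b = c  ->  ax = c - b
    a, b, c = state
    return (a, 0, c - b) if b != 0 else None

def divide_by_coefficient(state):        # expert rule: ax = c  ->  x = c / a
    a, b, c = state
    return (1, 0, c / a) if b == 0 and a not in (0, 1) else None

def move_without_sign_change(state):     # malrule: ax + b = c  ->  ax = c + b
    a, b, c = state
    return (a, 0, c + b) if b != 0 else None

# (name, rule, whether the rule is a correct one)
RULES = [
    ("subtract constant", subtract_constant, True),
    ("divide by coefficient", divide_by_coefficient, True),
    ("move constant without changing sign", move_without_sign_change, False),
]

def trace(state, student_steps):
    """Match each student step against the rule that would have produced it."""
    log = []
    for step in student_steps:
        fired = [(name, ok) for name, rule, ok in RULES if rule(state) == step]
        if not fired:
            log.append(("unrecognized step", False))
            break
        log.append(fired[0])
        if not fired[0][1]:
            break                        # a tutor would intervene at this point
        state = step
    return log
```

The sketch depends on the student supplying every intermediate equation. That dependence is the data requirement noted above, and it is why the method is harder to apply where responses follow one another without visible intermediate steps.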
2.3 Diagnostics
The diagnostic component of ITS updates the student model, inferring his
knowledge from his responses; or to put it another way, "the student model is a
data structure, and diagnosis is a process that manipulates it" (VanLehn 1988).
Diagnosis is clearly essential if instruction is to be adapted to the needs and
problems presented by particular students as instruction progresses. Moreover,
if teaching strategies are to be adaptive, their assumptions about a student's
knowledge and behavior have to be tested. Thus diagnosis is intimately linked
not only to the student model but to the planner as well. As Olsson (1986)
stresses, the question is not so much "what is in the student's head", as "what do
we need to know in order to teach?" For complex cognitive activities, detailed
student models, and a rich array of teaching tactics, it would be necessary to
know a good deal.
The diagnostic methods being developed in ITS and cognitive science more
generally have great potential payoff for the foreign language field, where
work on assessment and testing has been dominated by considerations of how much
a student knows, and where test results have been used for purposes of
statistical comparison and certification, rather than for the important
didactic goals of determining what a particular student knows at a particular
point in time. Continuing the contrast between diagnosis in ITS and ability
testing, for teaching it is generally not sufficient to know whether or not a
student has mastered a particular skill or gets a question right or wrong: it
is equally important to know exactly what errors he makes and why he might have
made them. A further, closely related difference in the treatment of errors
between diagnosis and ability testing is that in diagnosis most errors are
treated as systematic, not random, and thus to be accounted for by student
models, although allowance is made for performance lapses due to fatigue,
boredom, memory failures, distractions, and the like. Diagnosis, then, differs
from testing in method as well as intent.
VanLehn (1988) distinguishes nine types of diagnostic techniques in ITS
based on his typology of student models, but here it will be sufficient to
discuss the three broad classes suggested by Olsson (1986). All support student
models which at least contain a detailed specification of the skills and
subskills being taught. The simplest diagnostic methods relate to overlay
models, where the student's knowledge is described as a subset of expert
knowledge, without any particular attention paid to misinformation or
distortions (i.e. "errors"). Overlay methods are quite similar to those of CAI
and ability testing, with the stipulations that the skills to be "tested" are
highly disaggregated and mastery of each subskill evaluated. As a consequence,
overlay methods are practical for use outside the context of ITS. For example
Marshall (1980, 1981), not herself an ITS researcher but drawing on its
methods, has developed several algorithms for choosing problems for
presentation in an adaptive testing framework which enable overlay models to be
fitted efficiently.
A second class of models focuses on error descriptions, attempting to
account for specific errors at the behavioral level (e.g. specific incorrect
answers to a problem or exercise) by postulating computational mechanisms that
could produce them. Burton's analysis of "bugs" in multicolumn subtraction is
an example (Burton 1982; Brown and Burton 1978). Bugs, i.e. incorrect
procedures which may or may not produce errors depending on the problem, are
represented in the procedural network (described in the previous subsection)
along with correct procedures. Predictions of responses to problems presented
are calculated for all possible correct and buggy procedures and the best
fitting model selected. For reasons discussed in Burton (1982) this is a very
difficult task and the algorithm that implements it is computationally very
intensive. Nevertheless, with enough computing power, Burton's methods are
possible, and interesting findings have been obtained.
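The "predict and select the best fit" step can be illustrated on multicolumn subtraction with two buggy procedures. This is a greatly reduced sketch, not Burton's algorithm: it considers only two bugs, ignores combinations of bugs and performance slips, and scores fit by a simple count of matching answers. The function names and the observations are invented.

```python
def columns(top, bottom):
    """Digit pairs from the units column leftward (assumes top >= bottom >= 0)."""
    t = str(top)[::-1]
    b = str(bottom)[::-1].ljust(len(t), "0")
    return [(int(x), int(y)) for x, y in zip(t, b)]

def from_digits(digits):
    """Rebuild a number from digits listed units-first."""
    return int("".join(str(d) for d in reversed(digits)))

def correct(top, bottom):
    return top - bottom

def smaller_from_larger(top, bottom):
    # Bug: in every column, subtract the smaller digit from the larger.
    return from_digits([abs(x - y) for x, y in columns(top, bottom)])

def zero_instead_of_borrow(top, bottom):
    # Bug: write 0 whenever the top digit is too small, and never borrow.
    return from_digits([max(x - y, 0) for x, y in columns(top, bottom)])

PROCEDURES = {
    "correct": correct,
    "smaller-from-larger": smaller_from_larger,
    "zero-instead-of-borrow": zero_instead_of_borrow,
}

def diagnose(observations):
    """observations: list of ((top, bottom), answer). Return best procedure and all scores."""
    scores = {
        name: sum(proc(top, bottom) == answer for (top, bottom), answer in observations)
        for name, proc in PROCEDURES.items()
    }
    return max(scores, key=scores.get), scores
```

For a student who answers 55 to 52 - 7, 55 to 81 - 36 and 41 to 64 - 23, the smaller-from-larger procedure predicts all three answers and is selected. The third problem needs no borrowing, so every procedure gets it right, which illustrates why a bug may or may not surface depending on the problem.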
The third type of diagnostic method in ITS is simulation, of which
Anderson's "model tracing" procedure is perhaps the best example. Because they
are so detailed and require such specific data, Anderson's simulations are very
closely tied to the specific subject matter the tutor is teaching, and his
methods are difficult to extrapolate to other contexts. Olsson and Langley
(1988) have, however, suggested a simulation method, implemented in the
computer program DPF (Diagnostic Path Finder), which can be used in many ITS
domains, like language learning, where the extensive data required by model
tracing is unavailable, but where a cognitive model is entertained. Using task
analysis (e.g. a procedural network), selective search, and machine learning
techniques, DPF predicts both the specific behavioral path taken by the student
and the strategy (rules) used to get from each point on the path to the next.
Although it is promising because of its psychological underpinnings, DPF has
yet to be used in an ITS.
2.4 Teaching Knowledge
In addition to knowledge of the domain and knowledge of the student, ITSs
contain explicit representations of their knowledge of teaching. As noted in
section 1.2 the separation of teaching knowledge from domain knowledge is an
important difference between ITS and CAI. Tutors, of course, stand or fall on
the quality of their teaching knowledge. The specific tactics used, which are
derived mostly from observations of expert practice, and which for the most
part have not been tested empirically in ITS, are ITS's suppositions about how
to successfully support learning of a given skill. By and large teaching
knowledge in ITS is not that different from what one might obtain if perceptive
CAI designers verbalized why they present the examples and explanations they
do. The same might be said for teaching strategies, where a great deal is to be
learned from the educational and training literature on instructional design.
The explicitness and detail required by ITS, however, make proposed teaching
tactics easier to test and modify, both in the design stage of the ITS and in a
more experimentally oriented evaluation.
The content of the didactic component, i.e. the style of teaching used and
how it is represented, varies considerably among ITSs, depending both on
educational philosophy and on the kind of knowledge (declarative, procedural,
qualitative causal models, control strategies, metacognitive) they are trying
to communicate. The range includes:
- rules of thumb, governing case method dialogues, presentation of examples,
and styles of explanation, as represented by the literally hundreds of t-rules
in GUIDON (Clancey 1987, Appendix E);
- case and example selection rules and teaching goals of Collins and Stevens's
inquiry teaching and socratic tutors (Collins and Stevens 1978, 1983; Collins
and Grignetti 1975);
- principles of effective coaching, enunciated by Burton and Brown (1982) for
the arithmetic game WEST (see section 3.2); and
- elaborately rationalized strategies, based on a theory of cognitive learning,
employed in the LISP and Geometry tutors of Anderson and his colleagues
(see section 3.1).
Some AI based instructional systems, like STEAMER, have no real didactics
at all, and because of this they are more closely allied to the computer-based
microworlds discussed in section 1.3.
In an important review of ITS Olsson (1986) distinguishes teaching tactics,
the specific actions that tutors can take, from teaching strategies (to be
discussed presently), which connect subject matter analysis and the current
state of the student model with the tactics. Not surprisingly Olsson finds that
in order to provide adaptive instruction, a tutor must have a wide range of
instructional actions to choose from (his Principle of Versatile Output), but
that unless the conditions under which a particular tactic is to be evoked can
be identified, the tactic will not increase the power of the system (his
Principle of Strategic Repertoires). Considerable research is required to
determine what those conditions are, in foreign language teaching or any other
field.
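The point that a tactic is only useful together with its condition of use can be expressed by pairing each tactic with a test on the student model. The tactic names, the thresholds and the choose_tactic function below are hypothetical; which conditions are actually appropriate is the empirical question raised above.

```python
# Each tactic is paired with the condition under which it is to be evoked.
# Order matters: the first tactic whose condition holds is chosen.
TACTICS = [
    ("show a worked example", lambda s: s["consecutive_errors"] >= 3),
    ("give a hint", lambda s: s["consecutive_errors"] >= 1),
    ("pose a harder problem", lambda s: s["consecutive_correct"] >= 3),
    ("pose a similar problem", lambda s: True),   # default when nothing else applies
]

def choose_tactic(student_state):
    """Return the first tactic whose condition holds for the current student state."""
    for name, applies in TACTICS:
        if applies(student_state):
            return name
```

Adding a fifth tactic to the list extends the tutor's output, but it changes the tutor's behavior only if a condition can be written that distinguishes the situations in which it should be chosen.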
2.5 Planning and Control Structures
As Olsson (1986) has emphasized, all ITSs need to be able to generate a
teaching plan (or, as he calls it, a teaching strategy) on the basis of their
current tutorial goal, their knowledge of the subject matter, and their
assessment of the student. The plan is concerned, among other things, with the
sequencing and selection of materials and with the form that presentations
should take. Moreover, ITSs must be able to change their plans if the plans
prove not to be successful and if their assumptions about the student prove
incorrect. Clearly this kind of planning must be based on general knowledge of
pedagogy and assumptions about what is likely to work for a particular student
at a particular point, and because of this, at a behavioral level at least,
planning is not differentiated from teaching tactics in many ITS
implementations. It is worth maintaining the distinction, however, because in
the ITS literature planning points to different, and equally interesting,
directions for empirical research than simply establishing what works.
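Plan generation and plan revision can be sketched with a small prerequisite structure. The curriculum, the function names make_plan and revise, and the reduction of a teaching plan to an ordered list of skills are all simplifying assumptions; the planners cited in this section are considerably more elaborate.

```python
# Hypothetical curriculum: each skill lists the skills it presupposes.
PREREQUISITES = {
    "present tense": [],
    "past tense": ["present tense"],
    "conditional": ["past tense"],
}

def make_plan(goal, believed_known):
    """Sequence the skills to teach, placing unmet prerequisites before the goal."""
    plan = []

    def visit(skill):
        if skill in believed_known or skill in plan:
            return
        for prerequisite in PREREQUISITES[skill]:
            visit(prerequisite)
        plan.append(skill)

    visit(goal)
    return plan

def revise(goal, believed_known, disproved):
    """Replan after diagnosis shows that some presumed skills are in fact missing."""
    return make_plan(goal, set(believed_known) - set(disproved))
```

If the tutor assumes that the student knows the present and past tenses, the plan for the conditional is a single step. Once diagnosis disproves the assumption about the past tense, revise yields a longer plan that teaches it first.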
The development of planning mechanisms for tutoring is a relatively
neglected area in ITS, and tutors which tackle the issues head on are only now
reaching the prototype stage. As Peachey and McCalla (1986) point out, however,
planning has a long history in AI in connection with determining the physical
actions of real or simulated robots. They also note that planning is more
difficult in ITS than in robotics, because in the latter the plan is designed
to change the state of the physical world, which can be observed; while in ITS
the state that needs to be changed is the student's knowledge, which is
generally not fully observable. Furthermore, students are intelligent,
independent actors and in this regard, unlike the worlds of many robots, not
entirely predictable. Recent work in ITS by Peachey and McCalla (1986),
Macmillan and Sleeman (1987), Russell (1987), and Murray (1988) promises to
bring general planning methods to a practical stage.
Attention to results on planning in AI and ITS is important for research in
foreign language learning because, as Macmillan and Sleeman have stressed,
planning is fundamental to the way human tutors look at their work. Recent
empirical work by Leinhardt and Greeno (1986) and by Leinhardt and her
colleagues (Leinhardt, forthcoming; Leinhardt and Smith 1985; and Leinhardt,
Weidman and Hammond 1987) has applied AI planning ideas to an analysis of
the difference between novice and expert teachers. The increased richness of
the characterization of teaching behavior which these concepts allow opens up
many new avenues of research. Methods of inferring plans from behavior (the
obverse of planning, called "plan recognition" in AI), like cognitive modelling
and diagnosis, are difficult, time consuming and largely qualitative. Students too
have intentions and plans which govern their interactions with teachers and
computers; these also must be represented and established if student behavior is
to be understood and, from a pedagogical point of view, if students are to be helped.
The success of Johnson and Soloway's PROUST tutor (1987) in inferring the
plans and intentions of beginning programmers in PASCAL from errors in their
code, and the ability of the Unix Consultant built by Wilensky and his students
(Wilensky, Arens and Chin 1984; Wilensky et al. 1986) to infer what the user really
wants to know and do from often vague and ambiguous requests, similarly indicate
interesting empirical research directions.
Another relatively neglected area of concern in ITS is the whole question of
curriculum, i.e. the selection and sequencing of topics for instruction. Because of
the field's research and development orientation and the difficulty of encoding all
of the knowledge required, ITS designers have been concerned for the most part with
prototyping and producing instruction at the level of the lesson. (Anderson's
LISP tutor is an exception here.) Questions of how the knowledge to be mastered
relates to other knowledge that would be presented in a course, and how
the student's engagement with the ITS relates to the rest of his educational
experience, have not arisen. But as Lesgold (1988) has persuasively argued,
curriculum development, like planning at the level of the lesson, requires a careful
analysis of the goal structure of knowledge (prerequisites, dependencies,
part-whole relations) as well as a domain expert which captures the cognitive content.
To a certain extent the procedural networks and genetic graphs discussed in
section 3.2 address these issues. A related approach is that of the "curriculum
information networks" (CIN) used by the BIP tutors for the computer language
BASIC (Westcourt, Beard and Gould 1977; Westcourt, Beard and Barr 1981). A
CIN is a skill network with labeled links encoding such relationships as
prerequisite, analogue, harder-than, component-of, kind-of, and functional
dependency; unlike genetic graphs and procedural networks it does not represent
misinformation. BIP takes the CIN, an overlay model of the student, and a
catalogue of prewritten exercises, and makes decisions concerning which exercise to
present, thus automating many of the decisions built into the branching structure
of CAI. As Halff (1988) has noted, the literature on instructional systems design
is very useful for the task analysis which this kind of curriculum planning entails.
Lesgold (1988), Bonar, Glaser, and their students at the Learning Research and
Development Center at the University of Pittsburgh are designing a tutoring
architecture, using object oriented programming, focusing on knowledge goals,
but this work is still in the development stage.
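The kind of decision described above for BIP can be illustrated with a minimal Python sketch. It is not a reconstruction of BIP itself: the skill names, the exercises, and the selection rule (prefer the exercise that introduces the fewest new skills whose prerequisites are all mastered) are assumptions made only for illustration. The overlay model is represented simply as the set of expert skills the student is believed to have mastered.

```python
from dataclasses import dataclass

# Hypothetical CIN fragment: (skill_a, relation, skill_b) reads
# "skill_a is a <relation> of skill_b". Only "prerequisite" is used below.
CIN_LINKS = [
    ("print_literal", "prerequisite", "assign_variable"),
    ("assign_variable", "prerequisite", "for_loop"),
    ("assign_variable", "component-of", "for_loop"),
]


@dataclass(frozen=True)
class Exercise:
    name: str
    skills: frozenset  # skills the exercise practises


# Hypothetical catalogue of prewritten exercises.
EXERCISES = [
    Exercise("E1", frozenset({"print_literal"})),
    Exercise("E2", frozenset({"print_literal", "assign_variable"})),
    Exercise("E3", frozenset({"assign_variable", "for_loop"})),
]


def prerequisites(skill, links):
    """Skills connected to `skill` by a prerequisite link."""
    return {a for a, relation, b in links if relation == "prerequisite" and b == skill}


def choose_exercise(exercises, mastered, links):
    """Return the exercise introducing the fewest new skills, provided every
    new skill has all its prerequisites in the overlay (`mastered`)."""
    candidates = []
    for exercise in exercises:
        new_skills = exercise.skills - mastered
        ready = all(prerequisites(s, links) <= mastered for s in new_skills)
        if new_skills and ready:
            candidates.append(exercise)
    if not candidates:
        return None
    return min(candidates, key=lambda e: (len(e.skills - mastered), e.name))


# A student who has mastered only printing is given E2, not E3.
print(choose_exercise(EXERCISES, {"print_literal"}, CIN_LINKS).name)
```

The other link types (analogue, harder-than, and so on) are stored but unused here; a fuller sketch could use them to break ties or to choose remedial exercises.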
2.6 The Interface and the Environment
The last two components of ITS architecture, the environment and the
interface, comprise the computer as experienced by the learner. The environment
consists of the tasks the student is given and the tools he is given to perform
them. It shades off into the interface, which determines how the student
interacts with the tutor and with the domain. Many of the issues and alternatives in
the design of ITS environments have already been discussed in the section on
computer-based microworlds. Factors listed by Burton (1988), in an insightful
review, cover many of the issues in ITS in general: the knowledge to be learned
(communicated); the appropriate level of abstraction; the fidelity
(verisimilitude), in various respects, with which the knowledge needs to be represented;
sequencing of tasks, and adaptation of tools and props to the stage of learning;
the amount of structure imposed on exploration of the environment by task
definition; and the help provided (assistance in doing parts of the problem, aiding
the learner to reflect on his own performance and skills, coaching etc.).
Interface issues are closely related to teaching tactics and to those aspects of
the implementation of the teaching plan having to do with dialogue structure,
but different psychological and sociological considerations are involved.
Although the interface has received relatively little systematic attention in the ITS
literature (see, however, Hollan, Hutchins and Weitzman 1984; Frye and Soloway
1986; and Wenger 1987), this is beginning to change as a result of increased
interest in the general issues of interface design in computer science and AI.
(See Miller 1988, for a thorough review of the state of the art and its
implications for ITS.) One such issue, for example, is effective online help, a common
problem for most computer systems and applications. Only a fine line separates
help from coaching and tutoring, since to achieve their immediate goals users
often need to acquire some understanding of what they really want to do, how
the system works, and what their options are. Some very interesting research,
closely allied to ITS, is being carried out in the design of intelligent help systems
(e.g. Wilensky, Arens and Chin 1984; Wilensky et al. 1986; Fischer 1988). Of
particular note is the empirical and theoretical work of Breuker and his
colleagues (Breuker, forthcoming; Winkels and Sandberg 1987; Winkels, Sandberg
and Breuker, forthcoming) focusing on coaching and teaching strategies.
Intelligent help may be feasible where full-blown intelligent tutoring is not, because the
limited nature of the domain makes knowledge representation possible, and
because help requires less by way of explanation and actual problem solving than
tutoring. User modelling, planning, and effective explanatory tactics are still
necessary, however. With the advent of CD-ROM and other multimedia learning
environments, interface issues will become considerably more severe. Careful
design will be required if these new technologies are to be effective.
There are important considerations having to do with learning which bear
on the design of the interface. In Anderson's tutors the interface is carefully
designed to hold some of the information about the problems being solved on
screen in order to minimize burdens on short term memory and allow students
to concentrate on acquiring effective procedures (Lewis, Milson and Anderson
1987; Anderson, Boyle and Reiser 1985; Anderson, Boyle, Farrell and Reiser
1987). The interface also provides Anderson's model tracing procedures with
the information needed to infer the learner's state of knowledge and problem-
solving strategies. (In this respect it is the functional equivalent of "think aloud"
and other verbal protocols used in cognitive science; see Ericsson and Simon
1984.)
Since the interface in ITS (and CAI) determines the final form of the
communication between the program and the user, the dialogue between the two
could be managed at this level. Whether and to what extent the student or the
ITS communicates in natural language then becomes an important issue, as yet
to be resolved (see Burton, Brown and De Kleer 1982, for discussion and
experience with SOPHIE I, II and III). Other interface issues which should be
systematically explored, for CAI as well as ITS, and not only for computer-based
instruction, include: the required speed of response of the tutor for various
actions; screen management and the amount and type of information with which
the user must deal; helping the student keep track of where he has been and
where he is going during a session; and the pedagogical use of computer
graphics and other visual aids.
3 ITS and learning
From a research point of view the fundamental questions about foreign
language learning are "what do people know?" and "how do they learn it?"
Relatively concrete answers to these questions are also essential to good teaching
and to the design of tools to support it. As pointed out above, AI researchers
have had to think carefully about learning and effective teaching because they
have had to build explicit rationales and prescriptions into their didactic
knowledge bases, teaching strategies, and planning mechanisms, on the one hand, and
into the design of computer-based microworlds and the environments and
interfaces of ITS, on the other. Much of the empirical analysis of ITS and
microworlds has been directed toward testing these assumptions. The problems of
applying learning theories developed in the substantive domains studied by ITS
(e.g. mathematics, the maintenance and repair of complex equipment, and
programming) to foreign language learning should certainly not be minimized. One
of the hardest learned lessons of ITS, and AI in general, is how domain-specific
knowledge and successful procedures really are. Nevertheless it is encouraging
that for some complex, cognitive skills, theoretical models can be formulated
which make it possible to study learning in some depth and to design successful
learning environments in a principled way.
In this section I review two approaches to the conceptualization of learning
which have guided the development of ITS and which derive from experiences
with it. As cognitive theories, both stand in stark contrast to the behaviorism that
underlies most CAI. The first, exemplified by Anderson's ACT* theory of skill
acquisition and the tutoring principles derived from it, brings a general theory of
learning and cognition to bear on the design of tutors for very specific skills. The
second, exemplified by the cognitive apprenticeship framework of Collins,
Brown and Newman, synthesizes a wide range of experiences with apprenticeship,
microworlds, coaching, and ITS, in the form of a heuristic guide to the
design of learning environments. While these theories are very different in form
and motivation, each in its own way suggests interesting new avenues of research
and development in foreign language learning.
3.1 Theories of Skill Acquisition
The deepest and most explicit formulation of learning principles motivating
an ITS is a major theory of learning for cognitive skills in its own right,
Anderson's ACT* theory of cognitive performance and skill acquisition. Indeed the
design principles underlying the text processing, LISP, Geometry, and Algebra
tutors built by Anderson and his colleagues at Carnegie Mellon were derived
from ACT* and are specifically intended as empirical tests of ACT*'s validity. It
is an elaborate, complex, evolving theory (the latest incarnation is called PUPS),
and only the main points related to tutoring can be noted here. (For further
discussion see Anderson 1983, 1986, 1987, 1988; Anderson and Reiser 1985;
Anderson, Boyle and Reiser 1985; Anderson, Boyle and Yost 1985; Anderson,
Boyle, Corbett and Lewis 1988; Anderson, Boyle, Farrell and Reiser 1987;
Lewis, Milson and Anderson 1987.)
The principal foci of ACT* are on the structure and operation of short term
memory and the nature of knowledge. Short term memory is postulated to be
very limited; many errors observed in cognitive tasks are accounted for by these
limitations, not by lack of understanding. With regard to knowledge, following
Winograd (1975) and a large literature in AI, a basic distinction is made
between declarative knowledge, which is encoded quickly in schema-like
structures, without reference to how it will be used; and procedural knowledge, which
is embodied in highly efficient and use-specific forms and acquired through
putting declarative knowledge into practice. Procedural knowledge takes the form
of a production system (see also Klahr, Langley and Neches 1987), i.e. a set of
condition-action rules, with conditions including goals, such that if a condition is
encoded in working memory the action should take place. In a formal sense
production rules are like the rewrite rules of a grammar, but in an ACT* model of
language acquisition (e.g. Anderson 1987, Ch.7) they would not be restricted to
syntax. There are many hundreds of such rules, describing correct and incorrect
("buggy") programming in LISP and theorem proving in geometry, in the
models of expert and student knowledge in Anderson's tutors. In a language tutor
semantic, syntactic, pragmatic and discourse rules, expressed in terms of what
actions have to be taken to achieve communicative goals, would serve this function.
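To make the notion of a production system concrete, the following is a minimal Python sketch of goal-conditioned condition-action rules firing against a working memory. It is a toy interpreter, not ACT*'s actual architecture: the rule names, the facts, and the conflict-resolution policy (fire the first matching rule) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    name: str
    goal: str             # the rule applies only while this goal is active
    condition: frozenset  # facts that must be present in working memory
    adds: frozenset       # facts the action deposits in working memory


def run(productions, goal, memory, max_cycles=20):
    """Repeatedly fire the first rule whose goal and condition match and
    whose action would add something new; stop when no rule applies."""
    memory = set(memory)
    fired = []
    for _ in range(max_cycles):
        matches = [
            p for p in productions
            if p.goal == goal and p.condition <= memory and not p.adds <= memory
        ]
        if not matches:
            break
        rule = matches[0]  # naive conflict resolution (an assumption)
        memory |= rule.adds
        fired.append(rule.name)
    return fired, memory


# Hypothetical rules for the communicative goal of asking a price.
RULES = [
    Production("select-question-frame", "ask_price",
               frozenset({"item:known"}), frozenset({"frame:how-much"})),
    Production("insert-item", "ask_price",
               frozenset({"frame:how-much", "item:known"}),
               frozenset({"utterance:ready"})),
]

fired, memory = run(RULES, "ask_price", {"item:known"})
print(fired)
```

Because each rule carries its goal as part of its condition, the same working memory contents can trigger different actions under different communicative goals, which is the sense in which such rules would go beyond syntax.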
At the beginning of a process of skill acquisition people work from declarative
knowledge, holding it in working memory and transforming it into behavior
in a calculated way by various "weak" methods such as analogy. As learning
progresses more automatic and faster procedural knowledge is built up by a basic
mechanism called "knowledge compilation". The term implies a strong analogy
with the compilation of computer programs (see Wilensky 1986), where high
level code employing general and easy-to-use representations (e.g. FORTRAN)
is translated into a lower level (object) language which eliminates a lot of the
overhead (e.g. blind search, testing conditions that do not arise in the task at
hand, and inefficient deductions etc.) of converting the high level language into
machine instructions and physical actions. As with computer languages, the cost
of compilation is unintelligibility in the code, and this, along with the fact that
procedural knowledge is very situation-specific, is what accounts for the
difficulty in verbalization often associated with expert knowledge. Compilation takes
two main forms in Anderson's theory: "proceduralization", where specific rules
for special cases are derived from general rules; and "composition", where
several separate rules that might be applied sequentially are converted into a single
procedure. ACT* also has a mechanism called "strengthening" which further
speeds up performance. Inductive mechanisms, however, such as discrimination
and generalization, common in other cognitive theories and in other production
system models of learning (Klahr, Langley and Neches 1987), play no role in
ACT*.5
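The two forms of compilation can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Anderson's formulation: rules are reduced to sets of facts, variables are written as `{v}` placeholders filled with `str.format`, and the past-tense example is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    condition: frozenset  # facts required in working memory
    adds: frozenset       # facts produced by the action


def proceduralize(rule, **bindings):
    """Derive a special-case rule by fixing the variables of a general one."""
    def fill(facts):
        return frozenset(fact.format(**bindings) for fact in facts)
    return Rule(fill(rule.condition), fill(rule.adds))


def compose(first, second):
    """Collapse two rules that fire in sequence into a single rule.
    Conditions of `second` that `first` itself supplies need not be tested."""
    return Rule(first.condition | (second.condition - first.adds),
                first.adds | second.adds)


# General rule with a variable {v}, then a special case for one verb.
find_stem = Rule(frozenset({"verb:{v}", "tense:past"}), frozenset({"stem:{v}"}))
find_stem_walk = proceduralize(find_stem, v="walk")

# A second rule that would normally fire next.
add_suffix = Rule(frozenset({"stem:walk", "tense:past"}), frozenset({"form:walked"}))

# The composed rule goes from the verb and tense straight to the inflected form.
walk_past = compose(find_stem_walk, add_suffix)
print(sorted(walk_past.condition), sorted(walk_past.adds))
```

The composed rule no longer mentions the intermediate test for the stem, which gives a small-scale picture of why compiled knowledge is both faster and harder to verbalize.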
The teaching strategies of ACT* tutors are, then, derived from this general
theory. ACT* tutors are very directive. They are designed to facilitate
knowledge compilation by seeing that the student comes to the correct final
procedural code. Their methodology is called "model tracing", which takes a correct
model of the skill and the student's errorful procedures, finds a path to the
correct model, and insists that the student stay on it. Since the end result of the
tutoring process is knowledge-in-use (procedural knowledge), general instruction
is given only in the context of solving specific problems. Goals, the conditions of
the production rules, are made explicit. Since ACT* does not contain inductive
mechanisms, carefully juxtaposed examples to guide induction are unimportant:
students are simply told what the critical features are. Finally, immediate
feedback on errors and various aspects of the tutor's interface (the information
displayed on the computer screen) help students to manage working memory, so
that they do not make mistakes due to memory overload and can concentrate on
developing understanding and procedural skills.
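The directive character of model tracing, with immediate feedback at every step, can be conveyed by a short Python sketch. It is heavily simplified and rests on stated assumptions: the correct model is a single fixed sequence of steps rather than a set of production rules, known bugs are a lookup table, and the step names and feedback messages are hypothetical.

```python
def trace(correct_steps, buggy_feedback, student_actions):
    """Follow the student action by action. A correct action advances along
    the solution path; a recognized bug gets bug-specific feedback; anything
    else gets a hint. The student cannot leave the correct path."""
    position = 0
    log = []
    for action in student_actions:
        if position == len(correct_steps):
            break  # solution already complete
        expected = correct_steps[position]
        if action == expected:
            log.append(("ok", action))
            position += 1
        elif action in buggy_feedback:
            log.append(("bug", buggy_feedback[action]))
        else:
            log.append(("hint", f"next step: {expected}"))
    return position == len(correct_steps), log


# Hypothetical steps for writing a function definition.
CORRECT = ["write-defun", "name-function", "list-parameters"]
BUGS = {"list-parameters-first": "Name the function before listing its parameters."}

done, log = trace(
    CORRECT, BUGS,
    ["write-defun", "list-parameters-first", "name-function", "list-parameters"],
)
print(done, [kind for kind, _ in log])
```

Feedback is issued at the first deviating action rather than at the end of the problem, which is the property the text associates with keeping working memory load low.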
Anderson's ACT* (PUPS) is not the only general theory of skill acquisition
related to ITS. VanLehn and Brown's Repair Theory (VanLehn and Brown
1980; Brown and VanLehn 1982; VanLehn, Brown and Greeno 1984), and
VanLehn's subsequent elaborations of the learning mechanisms in his SIERRA
learning simulation program (VanLehn 1985a, 1985b, 1988a, 1988b; Wenger
1987, Ch.8) represent an equally comprehensive but quite different approach to
cognitive performance, learning, and knowledge representation (procedural
skills are represented as generalized AND/OR graphs, as opposed to
production systems). Repair theory derives from attempts to explain data on the
nature, origin and remediation of "bugs" observed in Burton and Brown's
subtraction tutors (Brown and Burton 1978; Burton 1982; Burton and Brown
1982). (Bugs are systematic but flawed procedures, which account not only for
which problems will be answered incorrectly but for precisely which wrong answers
will be given; more than 100 simple bugs, and many compound bugs, have been
described just for multicolumn subtraction!) The key concept is that of
"impasse" or failure driven learning, in which incomplete or "buggy" procedures
lead to evident errors and intended actions that cannot be performed. Errors
and impasses are overcome, often incorrectly, by ad hoc problem solving or
"repairs", which produce new procedures on the way to mastering the skill. Besides
the skill representation, the theory consists of a set of heuristic mechanisms that
generate the possible repairs and a set of "critics" which filter out some repairs
as unacceptable. While the intuitive underpinnings of Repair Theory and its
success in accounting for systematic errors are very suggestive for language
learning, the specific mechanisms discussed by Brown and VanLehn cannot be
extrapolated as easily as ACT*'s from the domain (multicolumn subtraction)
they are designed to explain. As VanLehn (1988a) points out, however, similar
ideas are found in the work of Wexler and Culicover (1980) and Berwick (1985)
on language acquisition, so this general approach merits careful study by
language researchers.
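The claim that a bug predicts precisely which wrong answers will be given can be illustrated with a small Python sketch. It models one well-known subtraction bug, in which the smaller digit in each column is subtracted from the larger regardless of position, and a naive diagnosis step that keeps the candidate procedures consistent with every observed answer. It does not model impasses, repairs, or critics, and the function names are assumptions made for illustration.

```python
def correct_subtraction(top, bottom):
    return top - bottom


def smaller_from_larger(top, bottom):
    """Buggy procedure: in each column subtract the smaller digit from the
    larger one, so borrowing never happens. Assumes top >= bottom >= 0."""
    top_digits = str(top)
    bottom_digits = str(bottom).zfill(len(top_digits))
    columns = [abs(int(t) - int(b)) for t, b in zip(top_digits, bottom_digits)]
    return int("".join(str(d) for d in columns))


def diagnose(observations, procedures):
    """Names of the procedures that reproduce every (top, bottom, answer)."""
    return [
        name for name, procedure in procedures.items()
        if all(procedure(top, bottom) == answer for top, bottom, answer in observations)
    ]


PROCEDURES = {
    "correct": correct_subtraction,
    "smaller-from-larger": smaller_from_larger,
}

# The bug gives 26 for 52 - 38 but the right answer for 57 - 32 (no borrowing).
student_work = [(52, 38, 26), (57, 32, 25)]
print(diagnose(student_work, PROCEDURES))
```

A problem that requires no borrowing cannot distinguish the buggy from the correct procedure, which is why diagnosis of this kind depends on a well-chosen set of test problems.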
3.2 A Framework for Studying and Designing Successful Learning Environments
Attractive as such models might be on general scientific grounds, the difficulty
and expense of building large and comprehensive computer models like ACT*
and SIERRA make their implementation for foreign language learning at best
a long term goal. A product of ITS research which is of more immediate
relevance to the study of foreign language learning is the framework constructed by
Collins, Brown and Newman (1987) to facilitate consideration of pedagogical
and theoretical issues that arise in designing or evaluating learning
environments. As a guide to design, it specifies conditions for the successful
construction of ITS, microworlds, and even classroom-based learning environments. As a
guide to research, it enumerates the main factors that must be taken into
account in empirical studies of what actually works.
The framework is based on a wide ranging and insightful analysis of the
common conditions associated with successful learning in a number of diverse
settings:
- traditional apprenticeship for occupational skills such as tailoring (Lave
forthcoming);
- a detailed analysis of three "success models" in schools: Palincsar and
Brown's (1984) reciprocal teaching method for reading; Scardamalia and
Bereiter's (1983, 1985) procedural facilitation for writing; and Schoenfeld's
(1983, 1985) methods of teaching mathematical problem solving. These
are characterized by the same principles as apprenticeship and accordingly
termed "cognitive apprenticeship";
- skiing instruction, a success model for a complex procedural skill (Brown,
Burton and Fischer 1986);
- microworlds, as described in section 1.3 above, and reactive learning
environments, e.g. SOPHIE; and, of course,
- intelligent tutoring systems.
It builds on the earlier paradigm of "increasingly complex microworlds"
(ICM) of Burton, Brown and Fischer (1984; also Fischer 1988), and is now being
extended in the work of Brown, Collins and Duguid (1988); see also Collins
and Brown (1988).
The key concept underlying all of the success models is "situated learning",
i.e. knowledge acquired in the social and functional context of its use. This is
contrasted with formal schooling, which has tended to emphasize the orderly
presentation of factual knowledge, abstracted (dissociated) from the context of
its use. The framework consists of a checklist of factors that must be considered
in the design of successful situated learning, and a set of suggestions, synthesized
from the "success models" above, as to how to employ them. It has four major
divisions: content, methods, sequencing, and sociology.
1. Content refers to the types of knowledge which are acquired in learning. Its
main categories are factual and procedural (domain) knowledge on the one
hand; and strategic knowledge (heuristic strategies for accomplishing tasks;
control strategies, such as planning at various levels, monitoring, and diagnosis; and
learning strategies) on the other.
2. Methods refer to the kinds of help given the learner in acquiring skills and
discovering knowledge. They include: making the sources and consequences of
errors apparent; "modelling", to enable the student to build a model of expert
practice by making explicit what is largely implicit; coaching, offering help to bring
the student closer to expert practice, and coaching aimed at executable advice
(advice that can be followed); "scaffolding", providing supports for carrying out
simplified but real tasks, and "fading", removing supports as skill develops;
reflective comparison with experts, other students, and ultimately with the
learner's inner cognitive model; and exploration of interesting subtasks and methods
subsumed under a general goal.
3. Sequencing addresses changing learning needs in different phases of the
acquisition process. The main considerations here are the management of
increasing complexity and increasing diversity in the tasks the student is given; and
coordinating instruction of local vs. global skills. The elements of sequencing are
addressed in detail in the earlier ICM framework. They include: maintaining
motivation through success, avoiding the dangers of oversimplification
(unjustified extrapolation and an unwillingness to try new things), structuring the
environment so that progressively less simple but still realistic versions of the target
expert skill are learned, and using task specification to focus attention on
important factors in the microworld.
4. Sociology refers to the social context of learning. Besides the methods of
situating knowledge so that students come to understand its purposes and
conditions of use, the category directs attention to exposure to examples of expert
practice and active communication about expertise; intrinsic motivation in the
tasks; and exploitation of cooperation and competition in the social situation of
learning. It is here that the framework has the greatest bearing on
classroom-based learning.
While the framework directs us to look in particular directions to
characterize good and bad instruction, and effective and ineffective learning, much
work needs to be done in variable specification and measurement in the foreign
language field before the framework can be used in rigorous research. As with
ACT*, experience with learning environments designed specifically according to
these principles will clearly play an important role. It would be an interesting
exercise for language pedagogues to reformulate the principles of successful
language learning in these terms, if for no other reason than to clarify the
similarities and differences between language learning and the learning of other
cognitive and procedural skills.
4 Conclusion
By way of conclusion, I would like to note and comment on some of the
main themes that emerge in this paper and that will be developed in detail with
regard to foreign language learning in the sequel. First and foremost, the focus of
all of the work that has been reviewed here has been on learning. Designers of
ITSs and microworlds have looked very concretely at what is to be learned, how
it is to be learned, and how learning can be effectively supported. The artifacts
that they have designed have been explicitly motivated by this analysis. By
contrast, many applications of advanced technologies to foreign languages have
taken learning issues for granted, implicitly relying on accepted educational
methodologies and other components of the larger system within which the new
technologies are to be embedded to achieve their instructional goals. It is in the
analysis of learning that the main interest of ITS for the foreign language field
lies.
Second, the orientation of AI research in ITS and microworlds is essentially
cognitive, although motivation is not ignored. The primary concern is with the
acquisition of knowledge, and with the cognitive and metacognitive processes,
and complex procedural skills, that put knowledge to use. Models of expert and
student knowledge are at the heart of ITS. Further, as Papert has so cogently
argued, the design of tasks and microworlds, which contain no specific tutorial
intervention but which allow the student to "discover" the relevant knowledge by
his own natural learning devices, is basically an epistemological enterprise.
Work in foreign language learning has put relatively little emphasis on the
precise specification of cognitive mechanisms involved in learning as compared
with other factors (e.g. presumed individual differences in motivation, learning
and cognitive styles). Redressing this imbalance, I shall argue, is essential for the
design of effective foreign language learning.
Third, as in many other problems in AI, the representation of knowledge
turns out to be the key research and design problem. Many would claim that it is
in the area of knowledge representation that AI has had its greatest successes.
Knowledge is represented in ITS not only in the data structures and
manipulations of the expert model, but in the physical structures of its tasks, interfaces,
and microworlds. Both domain knowledge and goal structure need to be
represented. If ITSs are to be built for foreign language learning it is the problems of
knowledge representation that must first be tackled. The knowledge of language
that can be usefully represented will determine the possibilities for ITS in
foreign language learning, and this must be carefully assessed.
Fourth, while one can usefully talk about the elements of ITS architecture
separately, it is striking how interdependent they are and, accordingly, how
important it is to design them concurrently. Teaching must be adapted to the
nature of the knowledge to be learned, and correlatively knowledge representation
must take into account the demands of teaching. Diagnosis updates student
models and at the same time guides the instructional planner: it is concerned
with what we need to know in order to teach. The interface and the
environment (the medium in which communication, interaction, and learning take
place) not only contain information about the domain, they also shape the way
the learner reflects on her knowledge and on her own learning processes. They
must accordingly be designed with learning considerations in mind.
Fifth, although I do not discuss this at all in the paper, it would be an
interesting exercise to stand many of the criticisms of CAI and ITS on their heads and
look at learning in the classroom in the light of the kinds of considerations that
the builders of ITSs have faced. The first question would be the obvious one:
given the knowledge to be communicated, and given what is known about the
way people learn, would classrooms (a teacher, several students etc.) be
invented in the first place, and if so how would they be structured? Or to put it
another way, for communicating knowledge, what is the comparative advantage of
the educational technology presently in use, and for what kinds of knowledge is
it particularly effective? With regard to what I have been calling learning
environments and diagnosis, one could ask similar questions about materials and
testing in traditional educational settings. (Of course there might be other
reasons to invent classrooms, textbooks, and testing besides their efficacy in
communicating knowledge.) I realize it is hard to treat such questions as anything
but rhetorical or polemical, but they do have a core of scientific content which
debates on education in our rapidly changing society cannot ignore.
Sixth, with regard to empirical research strategies, the sections on student
modelling and diagnosis describe rigorous computational models and methods
of analyzing data that point in very different directions from the statistical
models and methods which have dominated the educational and evaluational
literature on learning in instructional settings. The difference lies not so much in
the formalisms employed, although these are very different indeed, but in the
kinds of questions that are addressed and the goals that are served. Pace: of
course empirical work on foreign language learning needs both.
Finally, I would not want to exaggerate the extent to which any existing ITSs
or microworlds have reached the educational goals that they have set. That is a
difficult empirical question on which there has been lamentably little systematic
research. As I indicated at the outset, however, most ITSs are prototypes whose
primary purpose is research, and it is the results of this research that I have
stressed. Here, the ideas and methods developed have many fruitful and
practical applications to foreign language learning, in ways that will be the subject of
a subsequent paper.
Notes
1. One might with some justification say that the major achievements of AI as a discipline have
been primarily in the area of knowledge representation. On the declarative vs. procedural
representations see Winograd 1975. With reference to ITS, see Clancey's (1986b, 1987)
discussion of the motivation for NEOMYCIN as a basis for GUIDON; and Anderson (1988).
Although a useful one, the declarative/procedural distinction cannot be pushed too far, as
VanLehn (1988b) has cogently argued.
2. To anticipate a little, using insights generated by ITS research, misleading dichotomies, like
learning vs. acquisition, can be reformulated in a more general theory of learning pertinent
to instructional settings.
3. The word is used in many different senses, even by the same author (see Lawler 1987); some
key phrases, overlapping Pea's, are "limited, simplified slices of reality", "worlds with limited
possibilities", "fixed and limited objects, properties, and relations", "problem spaces", and
"task domains along with the tools to operate in them".
4. How systematic errors really are is, of course, an empirical question. Whether errors are
treated as random depends on how important and how hard it is to characterize them, as
well as the social and psychological aspects of assessment method in the domain.
5. Of course, conditioning, reinforcement, choice probabilities, and the rest of the conceptual
repertoire of behavioristic theories of learning are irrelevant altogether.
References
Abelson, H., and A.A. DiSessa. 1980. Turtle Geometry: The Computer as a Medium for Exploring
Mathematics. Cambridge, MA: MIT Press.
Anderson, J.R. 1983. The Architecture of Cognition. Cambridge, MA: Harvard.
Anderson, J.R. 1986. "Knowledge compilation: the general learning mechanism." Machine
Learning (volume 2) ed. by R. Michalski, J. Carbonell and T. Mitchell, 202-217. Palo Alto: Tioga.
Anderson, J.R. 1987. "Skill acquisition: compilation of weak-method problem solutions." Psycho-
logical Review 94.192-210.
Anderson, J.R. 1988. "The expert module." Polson and Richardson 1988.
Anderson, J.R. and B.J. Reiser. 1985. "The LISP tutor." Byte 10.159-175.
Anderson, J.R., C.F. Boyle and B.J. Reiser. 1985. "Intelligent tutoring systems." Science 228.456-458.
Anderson, J.R., C.F. Boyle and G. Yost. 1985. "The GEOMETRY tutor." Proceedings of the
Ninth International Joint Conference on Artificial Intelligence ed. by A. Joshi. Los Altos:
Kaufmann.
Anderson, J.R., C.F. Boyle, A. Corbett and M. Lewis. 1988. Cognitive modelling and intelligent tu-
toring. Draft. Pittsburgh: Carnegie Mellon.
Anderson, J.R., C.F. Boyle, R. Farrell and B.J. Reiser. 1987. "Cognitive principles in the design of
computer tutors." Modelling Cognition ed. by P. Morris. New York: Wiley.
Barchan, J. 1987. Language Independent Grammatical Error Reporter.
Barchan, J., B. Woodmansee and M. Yazdani. 1986. "A prolog-based tool for French grammatical
analysis." Instructional Science 15.21-48.
Barchan, J. and J. Wusterman. 1988. A Prolog-based tool for grammatical analysis of Western
European Languages. Research Report. Exeter, UK: Computer Science Department, University
of Exeter.
Berwick, R. 1985. The Acquisition of Syntactic Knowledge. Cambridge, MA: MIT.
Breuker, J. Forthcoming. "Coaching in help systems." To appear in Intelligent Computer-Aided
Instruction ed. by J. Self. In press. London: Chapman and Hall.
Brown, J.S., and J. Greeno, chairmen. 1984. "Report of the research briefing panel on information
technology in precollege education." Research Briefings 1984. Washington, DC: National
Academy Press.
Brown, J.S. and R.R. Burton. 1978. "Diagnostic models for procedural bugs in basic mathematical
skills." Cognitive Science 2.155-192.
Brown, J.S. and R.R. Burton. 1987. "Reactive learning environments for teaching electronic
troubleshooting." Advances in Man-Machine Systems 3.65-98.
Brown, J.S., A. Collins and P. Duguid. 1988. Cognitive apprenticeship, situated cognition and social
interaction. ( = Institute for Research on Learning Report, 8.) Palo Alto: Tioga.
Brown, J.S., T.P. Moran and M.D. Williams. 1982. The semantics of procedures: a cognitive basis
for maintenance training competency. Palo Alto: Xerox Corporation CIS Working Paper.
Brown, J.S. and K. VanLehn. 1982. "Repair theory: a generative theory of bugs in procedural
skills." Cognitive Science 4.379-426.
Burton, R.R. 1982. "Diagnosing bugs in a simple procedural skill." Sleeman and Brown 1982.157-183.
Burton, R.R. 1988. "The environment module of intelligent tutoring systems." Polson and
Richardson 1988.
Burton, R.R. and J.S. Brown. 1982. "An investigation of computer coaching for informal learning
activities." Sleeman and Brown 1982.79-98.
Burton, R.R., J.S. Brown and G. Fischer. 1984. "Skiing as a model of instruction." Everyday Cogni-
tion ed. by B. Rogoff and J. Lave, 139-150. Cambridge: Harvard.
Burton, R.R., J.S. Brown and J. De Kleer. 1982. "Pedagogical, natural language and knowledge
engineering techniques in SOPHIE I, II, and III." Sleeman and Brown 1982.227-282.
Carbonell, J.R. 1970. "AI in CAI: an artificial intelligence approach to computer-assisted
instruction." IEEE Transactions on Man-Machine Systems 11.190-202.
Carr, B. and I.P. Goldstein. 1977. Overlays: a theory of modeling for computer-aided instruction. ( =
Artificial Intelligence Memo, 406.) Cambridge, MA: MIT Press.
Cerri, S. and J. Breuker. 1981. "A rather intelligent language teacher." Studies in Language Learn-
ing 3.182-192.
Clancey, W.J. 1982. "Tutoring rules for generating case method dialog." Sleeman and Brown
1982.201-225.
Clancey, W.J. 1984. "Methodology for building an intelligent tutoring system." Models and Tactics
in Cognitive Science ed. by W. Kintsch, J. Miller and P. Polson. Hillsdale, NJ: Lawrence Erlbaum.
Clancey, W.J. 1986a. "Qualitative student models." Annual Review of Computer Science. Palo
Alto: Annual Reviews.
Clancey, W.J. 1986b. "From GUIDON to NEOMYCIN and HERACLES in twenty short lessons
(ONR Final Report 1979-1985)." AI Magazine 7.40-60.
Clancey, W.J. 1987. Knowledge-based Tutoring: the GUIDON Program. Cambridge, MA: MIT.
Clancey, W.J. 1988. "The knowledge engineer as student: metacognitive bases for asking good
questions." Mandl and Lesgold 1988.
Collins, A. and J.S. Brown. 1988. "The computer as a tool for learning through reflection." Mandl
and Lesgold 1988.
Collins, A. and M. Grignetti. 1975. Intelligent CAI ( = BBN Report, 3181.) Cambridge: Bolt Be-
ranek and Newman.
Collins, A. and A.L. Stevens. 1978. "Goals and strategies of inquiry teachers." Advances in Instruc-
tional Psychology ed. by R. Glaser, 65-119. Hillsdale, NJ: Lawrence Erlbaum.
Collins, A. and A.L. Stevens. 1983. "Cognitive theory of interactive teaching." Instructional Design
Theories and Models: An Overview of their Current Status ed. by C.M. Reigeluth. Hillsdale,
NJ: Lawrence Erlbaum.
Collins, A., J.S. Brown and S.E. Newman. 1987. "Cognitive apprenticeship: teaching the craft of
reading, writing, and mathematics." Cognition and Instruction: Issues and Agendas ed. by
L.B. Resnick. Hillsdale, NJ: Lawrence Erlbaum.
Dede, C.J., P.P. Zodhiates and C.L. Thompson. 1985. Intelligent computer-assisted instruction: a
review and assessment of ICAI research and its potential for education. Cambridge, MA:
Educational Technology Center, Harvard University.
Feuerman, K., C. Marshall, D. Newman and M. Rypa. 1987. The CALLE Project. Technical Re-
port. Pasadena: Xerox Corporation.
Fischer, G. 1988. "Enhancing incremental learning processes with knowledge-based systems."
Mandl and Lesgold 1988.
Frye, D. and E. Soloway. 1986. Interface design: a neglected issue in educational software. New
Haven: Department of Computer Science, Yale University.
Geesey, R., R. Ginsberg, J. Lancaster, E. Manukian and L. Reeker. 1989. Learning environments
for scientific and technical competency. Unpublished manuscript.
Goldstein, I.P. 1982. "The genetic graph: a representation for the evolution of procedural knowl-
edge." Sleeman and Brown 1982.51-77.
Goldstein, I.P. and S. Papert. 1977. "Artificial intelligence, language, and the study of knowledge."
Cognitive Science 1.1-21.
Halff, H.M. 1988. "Curriculum and instruction in automated tutors." Polson and Richardson 1988.
Haugeland, J. 1984. "First among equals." Models and Tactics in Cognitive Science ed. by W.
Kintsch, J. Miller and P. Polson. Hillsdale, NJ: Lawrence Erlbaum.
Hollan, J.D., E.L. Hutchins and L.M. Weitzman. 1984. "STEAMER: an interactive inspectable
simulation-based training system." AI Magazine 5/2.15-28.
IRL. 1988. The Advancement of Learning. Palo Alto: Institute for Research on Learning.
Johnson, W.B. 1988. "Pragmatic considerations in research, development, and implementation of
intelligent tutoring systems." Polson and Richardson 1988.
Johnson, W.L. and E. Soloway. 1987. "PROUST: an automatic debugger for Pascal programs."
Kearsley 1987.
Kaplan, R.M. and J. Bresnan. 1982. "Lexical-functional grammar: a formal system for grammati-
cal representation." The Mental Representation of Grammatical Relations ed. by J. Bresnan.
Cambridge, MA: MIT Press.
Kearsley, G.P., ed. 1987. Artificial Intelligence and Instruction: Applications and Methods. Reading:
Addison-Wesley.
Klahr, D., P. Langley and R. Neches, eds. 1987. Production System Models of Learning and Devel-
opment. Cambridge, MA: MIT Press.
Lave, J. In preparation. Tailored learning: apprenticeship and everyday practice among craftsmen in
West Africa. Stanford: IRL.
Lawler, R.W. 1987. "Learning environments: now, then, and someday." Lawler and Yazdani
1987.1-25.
Lawler, R.W. and M. Yazdani, eds. 1987. Learning Environments and Tutoring Systems ( =
Artificial Intelligence in Education, 1.) Norwood, NJ: Ablex.
Leinhardt, G. Forthcoming. "Math lessons: a contrast of novice and expert competence." To ap-
pear in Journal of Research in Mathematics Education.
Leinhardt, G. and J.G. Greeno. 1986. "The cognitive skill of teaching." Journal of Educational
Psychology 78.75-95.
Leinhardt, G. and D.A. Smith. 1985. "Expertise in mathematics instruction: subject matter knowl-
edge." Journal of Educational Psychology 77.247-271.
Leinhardt, G., C. Weidman and K.M. Hammond. 1987. "Introduction and integration of class-
room routines by expert teachers." Curriculum Inquiry 17.135-176.
Lesgold, A. 1988. "Toward a theory of curriculum for use in designing intelligent instructional sys-
tems." Mandl and Lesgold 1988.
Lewis, M.W., R. Milson and J.R. Anderson. 1987. "The TEACHER'S APPRENTICE: designing
an intelligent authoring system for high school mathematics." Kearsley 1987.
Macmillan, S.A. and D.H. Sleeman. 1987. "An architecture for a self-improving instructional
planner for intelligent tutoring systems." Computational Intelligence 3.17-27.
Mandl, H. and A. Lesgold, eds. 1988. Learning Issues for Intelligent Tutoring Systems. New York:
Springer Verlag.
Marshall, S.P. 1980. "Procedural networks and production systems in adaptive diagnosis." Instruc-
tional Science 9.129-143.
Marshall, S.P. 1981. "Sequential item selection: optimal and heuristic policies." Journal of Mathe-
matical Psychology 23.134-152.
McArthur, D. 1986. "Developing computer tools to support performing and learning complex
cognitive skills." Applications of Cognitive Psychology: Problem Solving, Education and Com-
puting ed. by K. Pezdek, D. Berger and B. Banks, 183-200. Hillsdale, NJ: Lawrence Erlbaum.
McArthur, D., C. Stasz and J.Y. Hotta. 1987. Learning problem-solving skills in algebra. Santa
Monica: Rand Corporation Note.
Miller, J.R. 1988. "The role of human-computer interaction in intelligent tutoring systems." Polson
and Richardson 1988.
Morgenstern, D. 1986. "The Athena language project." Hispania 69.740-745.
Murray, J.H., D. Morgenstern, and G. Furstenberg. 1987. The Athena language learning project:
design issues for the next generation of language learning tools. Draft. MIT Press.
Murray, W.R. 1988. Personal communication.
Ohlsson, S. 1986. "Some principles of intelligent tutoring." Instructional Science 14.293-326.
Ohlsson, S. and P. Langley. 1988. "Psychological evaluation of path hypotheses in cognitive
diagnosis." Mandl and Lesgold 1988.
O'Shea, T. and R. Bornat. 1987. A five component model for computer-based training. Unpublished
manuscript.
O'Shea, T., R. Bornat, B. du Boulay, M. Eisenstadt and I. Page. 1984. "Tools for creating intelli-
gent computer tutors." Artificial and Human Intelligence ed. by A. Elithorn and R. Banerji,
181-199. Amsterdam: North Holland.
O'Shea, T. and J.A. Self. 1983. Learning and Teaching with Computers. Englewood Cliffs, NJ:
Prentice-Hall.
Palincsar, A.S. and A.L. Brown. 1984. "Reciprocal teaching of comprehension-fostering and
monitoring activities." Cognition and Instruction 1.117-175.
Papert, S. 1980. Mindstorms: Children, Computers, and Powerful Ideas. New York: Basic Books.
Papert, S. 1986. Rethinking mathematics learnability in a computer culture. Unpublished lecture,
Stanford University.
Park, O-C, R.S. Perez, and R.J. Seidel. 1987. "Intelligent CAI: old wine in new bottles, or a new
vintage?" Kearsley 1987.
Pea, R.D. 1987. "Integrating human and computer intelligence." Pea and Sheingold 1987.128-146.
Pea, R.D. and K. Sheingold, eds. 1987. Mirrors of Minds: Patterns of Experience in Educational
Computing. Norwood, NJ: Ablex.
Pea, R.D. and E. Soloway. 1987. Mechanisms for facilitating a vital and dynamic education system:
fundamental roles for education science and technology. Final Report for OTA, US Congress.
Peachey, D.R. and G.I. McCalla. 1986. "Using planning techniques in intelligent tutoring
systems." International Journal of Man-Machine Studies 24.77-98.
Polson, M.C. and J.J. Richardson, eds. 1988. Foundations of Intelligent Tutoring Systems. Hillsdale,
NJ: Lawrence Erlbaum.
Psotka, J., L.D. Massey and S.A. Mutter, eds. 1988. Intelligent Tutoring Systems: Lessons Learned.
Hillsdale, NJ: Lawrence Erlbaum.
Russell, D.M. 1987. "The instructional design environment: Interpreter." Psotka, Massey and
Mutter 1988.
Scardamalia, M. and C. Bereiter. 1983. "Child as co-investigator: helping children gain insight into
their own mental processes." Learning and Motivation in the Classroom ed. by S.G. Paris, G.
Olson and H. Stevenson, 61-82. Hillsdale, NJ: Lawrence Erlbaum.
Scardamalia, M. and C. Bereiter. 1985. "Fostering the development of self-regulation in children's
knowledge processing." Teaching and Learning Skills: Research and Open Questions ed. by S.
Chipman, J.W. Segal and R. Glaser. Hillsdale, NJ: Lawrence Erlbaum.
Schank, R.C. 1984. The Cognitive Computer. Reading, MA: Addison-Wesley.
Schoenfeld, A.H. 1983. Problem solving in the mathematics curriculum: a report, recommendations
and an annotated bibliography. ( = MAA Notes, 1.) The Mathematical Association of America.
Schoenfeld, A.H. 1985. Mathematical Problem Solving. New York: Academic Press.
Sleeman, D. and J.S. Brown, eds. 1982. Intelligent Tutoring Systems. London: Academic Press.
Suppes, P., ed. 1981. University-Level Computer-Assisted Instruction at Stanford: 1968-1980. Palo
Alto: Stanford University, Institute for Mathematical Studies in the Social Sciences.
Uren, J. and M. Yazdani. 1988. Spanish LINGER. Research Report. Exeter, UK: University of
Exeter, Computer Science Department.
VanLehn, K. 1985a. Acquiring procedural skills from lesson sequences. ( = Technical Report ISL,
9.) Palo Alto: Xerox Corporation.
VanLehn, K. 1985b. Learning one subprocedure per lesson. ( = Technical Report ISL, 10.) Palo
Alto: Xerox Corporation.
VanLehn, K. 1988a. "Toward a theory of impasse-driven learning." Mandl and Lesgold 1988.
VanLehn, K. 1988b. "Student modeling." Polson and Richardson 1988.
VanLehn, K. and J.S. Brown. 1980. "Planning nets: a representation for formalizing analogies and
semantic models for procedural skills." Aptitude, Learning, and Instruction, vol. 2: Cognitive
Process Analysis of Learning and Problem Solving ed. by R.E. Snow, P.A. Federico and W.E.
Montague. Hillsdale, NJ: Lawrence Erlbaum.
VanLehn, K., J.S. Brown and J.G. Greeno. 1984. "Competitive argumentation in computational
theories of cognition." Models and Tactics in Cognitive Science ed. by W. Kintsch, J. Miller,
and P. Polson. Hillsdale, NJ: Lawrence Erlbaum.
Wenger, E. 1987. Artificial Intelligence and Tutoring Systems: Computational and Cognitive
Approaches to the Communication of Knowledge. Los Altos: Kaufmann.
Westcourt, K., M. Beard and L. Gould. 1977. "Knowledge-based adaptive curriculum sequencing
for CAI: application of a network representation." Proceedings of the National ACM
Conference, Seattle, Washington, 234-240. New York: ACM.
Westcourt, K., M. Beard and A. Barr. 1981. "Curriculum information networks for CAI: research
on testing and evaluation by simulation." Suppes 1981.817-839.
Wexler, K. and P. Culicover. 1980. Formal Principles of Language Acquisition. Cambridge, MA:
MIT Press.
Wilensky, R. 1986. Common LISPcraft. New York: Norton.
Wilensky, R., Y. Arens and D. Chin. 1984. "Talking to UNIX in English: an overview of UC."
Communications of the ACM 27.574-593.
Wilensky, R., et al. 1986. "UC — a progress report." Berkeley: Computer Science Division, Univer-
sity of California, Report no. UCB/CSD 87/303.
Winkels, R. and J. Sandberg. 1987. The EUROHELP coach ( = Memo 94 of the VF Project.)
Amsterdam: Department of Social Science Informatics, University of Amsterdam.
Winkels, R., J. Sandberg and J. Breuker. 1986. Coaching Strategies and tactics for IHSs ( = Memo
78 of the VF Project.) Amsterdam: Department of Social Science Informatics, University of
Amsterdam.
Winograd, T. 1975. "Frame representations and the declarative/procedural controversy." Repre-
sentation and Understanding: Studies in Cognitive Science ed. by D.G. Bobrow and A.M. Col-
lins, 185-210. New York: Academic Press.
Winograd, T. and F. Flores. 1985. Understanding Computers and Cognition: A New Foundation for
Design. Norwood: Ablex.
Xerox. 1985. CALLE Project Final Report. Pasadena: Xerox Special Information Systems, Vista
Laboratory.
Yazdani, M., ed. 1984. New Horizons in Educational Computing. New York: Wiley.
Yazdani, M. 1986. "Intelligent tutoring systems: an overview." Expert Systems 3.154-162.
Yazdani, M. 1988. Language tutoring with Prolog. Draft. Exeter, UK: University of Exeter,
Department of Computer Science.
In the series Studies in Bilingualism (SiBil) the following titles have been published thus far or
are scheduled for publication:
35 Rocca, Sonia: Child Second Language Acquisition. A bi-directional study of English and Italian tense-
aspect morphology. 2007. xvi, 240 pp.
34 Koven, Michèle: Selves in Two Languages. Bilinguals' verbal enactments of identity in French and
Portuguese. 2007. xi, 327 pp.
33 Köpke, Barbara, Monika S. Schmid, Merel Keijzer and Susan Dostert (eds.): Language Attrition.
Theoretical perspectives. 2007. viii, 258 pp.
32 Kondo-Brown, Kimi (ed.): Heritage Language Development. Focus on East Asian Immigrants. 2006.
x, 282 pp.
31 Baptista, Barbara O. and Michael Alan Watkins (eds.): English with a Latin Beat. Studies in
Portuguese/Spanish – English Interphonology. 2006. vi, 214 pp.
30 Pienemann, Manfred (ed.): Cross-Linguistic Aspects of Processability Theory. 2005. xiv, 303 pp.
29 Ayoun, Dalila and M. Rafael Salaberry (eds.): Tense and Aspect in Romance Languages. Theoretical
and applied perspectives. 2005. x, 318 pp.
28 Schmid, Monika S., Barbara Köpke, Merel Keijzer and Lina Weilemar (eds.): First Language
Attrition. Interdisciplinary perspectives on methodological issues. 2004. x, 378 pp.
27 Callahan, Laura: Spanish/English Codeswitching in a Written Corpus. 2004. viii, 183 pp.
26 Dimroth, Christine and Marianne Starren (eds.): Information Structure and the Dynamics of
Language Acquisition. 2003. vi, 361 pp.
25 Piller, Ingrid: Bilingual Couples Talk. The discursive construction of hybridity. 2002. xii, 315 pp.
24 Schmid, Monika S.: First Language Attrition, Use and Maintenance. The case of German Jews in
anglophone countries. 2002. xiv, 259 pp. (incl. CD-rom).
23 Verhoeven, Ludo and Sven Strömqvist (eds.): Narrative Development in a Multilingual Context.
2001. viii, 431 pp.
22 Salaberry, M. Rafael: The Development of Past Tense Morphology in L2 Spanish. 2001. xii, 211 pp.
21 Döpke, Susanne (ed.): Cross-Linguistic Structures in Simultaneous Bilingualism. 2001. x, 258 pp.
20 Poulisse, Nanda: Slips of the Tongue. Speech errors in first and second language production. 1999.
xvi, 257 pp.
19 Amara, Muhammad Hasan: Politics and Sociolinguistic Reflexes. Palestinian border villages. 1999.
xx, 261 pp.
18 Paradis, Michel: A Neurolinguistic Theory of Bilingualism. 2004. viii, 299 pp.
17 Ellis, Rod: Learning a Second Language through Interaction. 1999. x, 285 pp.
16 Huebner, Thom and Kathryn A. Davis (eds.): Sociopolitical Perspectives on Language Policy and
Planning in the USA. With the assistance of Joseph Lo Bianco. 1999. xvi, 365 pp.
15 Pienemann, Manfred: Language Processing and Second Language Development. Processability theory.
1998. xviii, 367 pp.
14 Young, Richard and Agnes Weiyun He (eds.): Talking and Testing. Discourse approaches to the
assessment of oral proficiency. 1998. x, 395 pp.
13 Holloway, Charles E.: Dialect Death. The case of Brule Spanish. 1997. x, 220 pp.
12 Halmari, Helena: Government and Codeswitching. Explaining American Finnish. 1997. xvi, 276 pp.
11 Becker, Angelika and Mary Carroll: The Acquisition of Spatial Relations in a Second Language. In
cooperation with Jorge Giacobbe, Clive Perdue and Rémi Porquiez. 1997. xii, 212 pp.
10 Bayley, Robert and Dennis R. Preston (eds.): Second Language Acquisition and Linguistic Variation.
1996. xix, 317 pp.
9 Freed, Barbara F. (ed.): Second Language Acquisition in a Study Abroad Context. 1995. xiv, 345 pp.
8 Davis, Kathryn A.: Language Planning in Multilingual Contexts. Policies, communities, and schools in
Luxembourg. 1994. xix, 220 pp.
7 Dietrich, Rainer, Wolfgang Klein and Colette Noyau: The Acquisition of Temporality in a Second
Language. In cooperation with Josée Coenen, Beatriz Dorriots, Korrie van Helvert, Henriette Hendriks,
Et-Tayeb Houdaïfa, Clive Perdue, Sören Sjöström, Marie-Thérèse Vasseur and Kaarlo Voionmaa. 1995.
xii, 288 pp.
6 Schreuder, Robert and Bert Weltens (eds.): The Bilingual Lexicon. 1993. viii, 307 pp.
5 Klein, Wolfgang and Clive Perdue: Utterance Structure. Developing grammars again. In cooperation
with Mary Carroll, Josée Coenen, José Deulofeu, Thom Huebner and Anne Trévise. 1992. xvi, 354 pp.
4 Paulston, Christina Bratt: Linguistic Minorities in Multilingual Settings. Implications for language
policies. 1994. xi, 136 pp.
3 Döpke, Susanne: One Parent – One Language. An interactional approach. 1992. xviii, 213 pp.
2 Bot, Kees de, Ralph B. Ginsberg and Claire Kramsch (eds.): Foreign Language Research in Cross-
Cultural Perspective. 1991. xii, 275 pp.
1 Fase, Willem, Koen Jaspaert and Sjaak Kroon (eds.): Maintenance and Loss of Minority Languages.
1992. xii, 403 pp.
