Language Learning
ISSN 0023-8333
REVIEW ARTICLE
Interactions Between Type of Instruction
and Type of Language Feature:
A Meta-Analysis
Nina Spada
University of Toronto
Yasuyo Tomita
University of Toronto
A meta-analysis was conducted to investigate the effects of explicit and implicit instruction on the acquisition of simple and complex grammatical features in English.
The target features in the 41 studies contributing to the meta-analysis were categorized
as simple or complex based on the number of criteria applied to arrive at the correct
target form (Hulstijn & de Graaff, 1994). The instructional treatments were classified as
explicit or implicit following Norris and Ortega (2000). The results indicate larger effect
sizes for explicit over implicit instruction for simple and complex features. The findings also suggest that explicit instruction positively contributes to learners' controlled
knowledge and spontaneous use of complex and simple forms.
Keywords instructed SLA; explicit/implicit instruction; explicit/implicit knowledge;
meta-analysis
We are grateful to the anonymous reviewers who provided constructive input and valuable feedback
on an earlier version of the manuscript. We are particularly grateful for the information and guidance
from one of the reviewers with particular expertise in conducting meta-analyses. Although the
article has improved considerably from the input from all reviewers, we alone are responsible for
any errors or omissions. A paper based on this research was presented at the annual meeting of the
American Association for Applied Linguistics Conference 2008, Washington, DC.
Correspondence concerning this article should be addressed to Nina Spada, Modern Language
Centre, Ontario Institute for Studies in Education, University of Toronto, 252 Bloor Street West,
Toronto, Ontario M5S 1V6, Canada. Internet: nspada@oise.utoronto.ca
Language Learning 60:2, June 2010, pp. 263–308
© 2010 Language Learning Research Club, University of Michigan
DOI: 10.1111/j.1467-9922.2010.00562.x

In the instructed second language acquisition (SLA) literature there is a general
consensus that instruction is beneficial for second language (L2) development
(Ellis, 2001; Norris & Ortega, 2000; Spada, 1997). Several issues remain,
however, including what types of knowledge and language abilities benefit
most from instruction and whether the benefits of instruction vary depending
on the type of language feature targeted. The last question is the focus of this
article. More specifically, we carried out a meta-analysis to investigate whether
simple and complex grammatical features benefit from explicit and implicit
instruction in similar or different ways. We also address interactions between
type of instruction and type of L2 knowledge resulting from it.
The question as to whether the effects of type of instruction vary in relation to type of language feature is of considerable interest to L2 classroom
researchers and teachers. Yet, little research has been done to directly compare
the effects of different instructional approaches on different language forms. In
the existing studies, decisions as to what language features to teach are usually
based on whether the features are considered simple or complex to describe
or easy or difficult to learn. Some researchers have claimed that whereas easy
rules can be taught, hard rules are by their very nature too complex to be successfully taught and thus difficult to learn through traditional explanation and
practice pedagogy. Hard rules are thought to be best learned implicitly, embedded in meaning-based practice (Krashen, 1982, 1994; Reber, 1989). Others
have claimed the opposite: that simple morphosyntactic rules are best learned
under implicit conditions and that the learning of complex rules is best accomplished with explicit teaching (Hulstijn & de Graaff, 1994). The argument is
that complex features are difficult to notice in naturally occurring input and
that explicit instruction is necessary in order to help learners discover complex
rules. Simple features, on the other hand, are easily available to be noticed (and
subsequently learned) via input.
Empirical work investigating these claims has produced mixed results. In a laboratory
experiment with Dutch learners of a semiartificial language, de Graaff (1997)
hypothesized that there would be an interaction between simple (i.e., morphology) and complex (i.e., syntax) features with explicit and implicit instruction,
but the prediction was not supported. In other laboratory research, interactions
between type of instruction and type of language feature have been observed. In
a study investigating the effects of four conditions on the learning of a complex
rule (pseudo-clefts of location) and a simple rule (subject-verb inversion) in
English, Robinson (1996) reported that explicit instruction was more effective
for learning simple rules, but contrary to his prediction, implicit instruction
did not appear to be more effective for learning complex rules. In an experimental study of four morphological rules in an artificial language, DeKeyser
(1995) reported that categorical rules were learned much better in an explicit
condition. Support for his hypothesis that implicit instruction would work best
with prototypical rules was less clear. In the few classroom studies that have
directly investigated the effects of instruction on simple and complex language
features, the results are also mixed. Williams and Evans (1998) examined the
effects of implicit (i.e., input flood) and explicit (i.e., input flood plus explicit
rule explanation) instruction on the acquisition of two features of English that
were contrasted as easy and difficult (i.e., participial adjectives and passive
voice). They found that whereas explicit instruction worked best for adjectives,
explicit and implicit instruction were equally effective for the passive voice.
In another classroom study that investigated the effects of explicit instruction
on two features of French characterized as more complex (i.e., passive) or less
complex (i.e., negation), Housen, Pierrard, and Van Daele (2005) hypothesized
that the explicit instruction would work best for the more complex feature, but
the results indicated that instruction was equally effective for both language
features.
In addition to the question as to whether type of instruction may be a
factor in the learning of different language features, there is also the issue
concerning what type of knowledge results from instruction. In the Norris
and Ortega (2000) meta-analysis of the effects of instruction in L2 learning,
advantages were found for explicit over implicit instruction. However, as the
researchers pointed out, most of the studies used discrete point or declarative
knowledge-based tests (i.e., measures of explicit knowledge) rather than tests
of fluency or spontaneous speech (i.e., measures of implicit knowledge). This
bias in instructed SLA research led Doughty (2003) to conclude that "the case
for explicit instruction has been overstated" (p. 274). She further argued that
until studies include more measures of implicit knowledge, we cannot be confident that instruction leads to L2 competence that is unconscious, unanalyzed,
and available for use in rapid, spontaneous communication. Since the Norris
and Ortega (2000) meta-analysis, more studies have incorporated measures of
spontaneous and unanalyzed L2 use in their research design, making it possible
to further explore this issue in the present meta-analysis.
Related to the above is the question of whether the relative difficulty of
grammatical features corresponds with implicit and explicit L2 knowledge.
As observed by many language teachers and researchers, L2 learners who
can articulate the rules for particular grammatical features may not use the
same features correctly in their spontaneous performance. Some empirical
evidence to support this is reported in Ellis (2006b) in which structures that
were discovered to be easy for learners in terms of their explicit knowledge were
difficult with respect to their implicit knowledge and vice versa. For example,
learners obtained higher scores on tests of explicit knowledge measuring the
plural -s, indefinite article and regular past tense -ed than they did on tests of
implicit knowledge measuring the same features. As Ellis pointed out, "these
are all features for which ready rules-of-thumb are available and which many
of the learners had probably been formally taught, [however] an easy to grasp
feature does not guarantee its accurate use as implicit knowledge" (2006b,
p. 458).
This meta-analysis builds on and expands the Norris and Ortega (2000)
meta-analysis by investigating whether the effects of type of instruction vary
depending on type of language feature and whether learners possess different
types of knowledge of language features as a result of different instructional
input. The specific questions motivating the research were as follows:
1. Do the effects of explicit and implicit instruction vary with simple and
complex features in the short and long term?
2. Do the effects of explicit and implicit instruction lead to similar types of
language ability for complex and simple forms?
For the purpose of this study, we have defined explicit and implicit instruction following Norris and Ortega's (2000) definitions, which are outlined
below. Our definition of implicit and explicit knowledge is based on Ellis (2006a); that is, explicit knowledge "is held consciously, is learnable and
verbalisable, and is typically accessed through controlled processing when
learners experience some kind of linguistic difficulty in using the L2" (Ellis,
2006a, p. 95). Implicit knowledge, on the other hand, is defined as knowledge
that "is procedural, is held unconsciously, and can only be verbalized if it is
made explicit" (p. 95). Implicit knowledge is thought to be used in spontaneous
communication due to its rapid and easy access.¹
We began our investigation by reviewing the literature on how simple and
complex features have been defined. We discovered that not only is there a lack
of consensus in terms of how simple and complex are conceptualized, but there
are also inconsistencies in the way in which they have been characterized in
empirical studies. There appear to be three main ways in which complexity has
been defined: from psycholinguistic, linguistic, and pedagogical perspectives.
In the psycholinguistic sense, complexity has been defined in terms of whether
a feature is acquired early or late or is more or less difficult to process. For
example, in the SLA literature there is evidence that learners go through a series
of predictable stages in their L2 development (Lightbown, 1980; Pienemann,
1989; Ravem, 1973; Schumann, 1979; Wode, 1976). One explanation for this
is that learners' psycholinguistic processing abilities moderate their progress
through the developmental stages such that learners cannot move forward to the
next developmental stage until they are able to cognitively process the structures
at earlier stages (Meisel, Clahsen, & Pienemann, 1981). Grammatical difficulty
arises when learners are expected to learn grammatical structures that they are
not developmentally ready to learn. Although some support for the benefits of
instruction targeted to learners when they are developmentally ready has been
reported (e.g., Mackey & Philp, 1998; Pienemann, 1998; Spada & Lightbown,
1999), little research has explored this issue.
Hulstijn and de Graaff (1994) also discussed complexity from a cognitive
perspective, not in terms of acquisition orders, but with respect to ease and
duration of acquisition. They argued that "the degree of complexity is contingent not so much on the number of forms in a paradigm, but rather, on the
number (and/or the type) of criteria to be applied in order to arrive at the correct
form" (p. 103). For example, if the formation of the plural suffix in language
X requires more steps to arrive at the correct form than in language Y, then the
plural suffix can be considered a more complex form to learn in language X.
Hulstijn and de Graaff further argued that complexity interacts with other factors such as scope and reliability of the rule, semantic redundancy, and item
learning.
Complexity defined from a linguistics perspective has to do with whether,
for example, the language feature has many or few transformations, is marked
or unmarked, and is typologically similar or different from the first language.
DeKeyser (2005) claimed that there are at least three factors involved in determining linguistic difficulty: complexity of form, complexity of meaning,
and complexity of form-meaning mapping. A case in which a form can cause
difficulty would be how to use the correct morpheme in the correct place (e.g.,
plural -s in English). Meaning expressed through a grammatical form can
also bring about learning difficulty due to its abstractness (e.g., articles in
English). The rules regarding their use are considered too abstract for learners to infer from the input, and explicit instruction on article use is often not
effective. According to DeKeyser (2005), problems of form-meaning mapping
are another source of grammatical difficulty when the form is semantically
unessential (e.g., third-person singular in English), when the use of the form is
nonmandatory (e.g., null subjects in Spanish and Italian), when exposure to the
target feature is limited, and when the correlation between form and meaning is
low, as with the -s morpheme in English, which serves several functions (e.g.,
plural of noun, third-person singular).
Other determinants of linguistic difficulty discussed in the SLA literature
include perceptibility. Doughty and Williams (1998) argued that salience of
the form in the input plays a role in determining whether the form is difficult
to learn. This is supported by Schmidt's (1990, 1994, 2001) hypothesis that
learners cannot learn something that they do not notice in the input. The importance of salience is also supported by a meta-analysis investigating the
factors determining the natural order of morpheme acquisition in English
(Goldschneider & DeKeyser, 2005). The learners' first language (L1) has also
been identified as contributing to the difficulty of learning certain language features. Examples include the distinction between avoir/être (i.e., have/be) in French
(Harley, 1993) and cases in which learners have developed an interlanguage rule,
based on their L1, that is more general than the L2 rule (e.g., adverb placement
in English for French-speaking students) (White, 1991). Spada and Lightbown
(2008) discussed how the communicative value of a language form can also
contribute to its difficulty. For example, errors that interfere with meaning
(e.g., incorrect use of possessive pronouns his/her) may be easier to remediate
than errors that do not interfere with meaning (e.g., the absence of inversion
in questions such as What she is reading?); forms associated with the latter
type of error are thus more difficult to learn.
From a pedagogical perspective, complexity has been described in terms of
whether a grammatical form is easy or difficult for students to understand and
to learn. Problematic grammatical features are identified, mainly by teachers,
by observing learners' performance errors. The features that L2 learners fail to
systematically use accurately in their production are the features considered to
be difficult for them to learn. Asking teachers which language structures were
more or less difficult for their learners was the way in which complexity was
determined in Robinson's (1996) study described earlier.
Although these efforts to define complexity are useful, there are problems
associated with each. For example, the concern with a psycholinguistic definition of complexity based on developmental orders of acquisition is that it is
circular in nature; that is, the language feature is acquired late, therefore it must
be complex. Linguistic definitions are considered problematic because, as SLA
researchers and L2 instructors know, what is easy to describe is not necessarily
easy to learn (e.g., third-person singular -s in English). The difficulty with
pedagogical descriptions of rule complexity is that although a specific rule may
be difficult for one learner, it may not be difficult for another learner and this is
likely related to such factors as aptitude and L1 background (DeKeyser, 2003).
The fact that we are far from having a generally accepted metric for distinguishing between simple and complex structures is evident in the following
quotation:
Different studies use different criteria to distinguish between simple and
complex structures. For example Krashen (1982) considers the 3rd person
simple present -s marker in English as a formally simple structure
because of its paradigmatic uniqueness while Ellis (1990) classifies it as
formally complex because of the distance between the verb stem and the
noun phrase with which it agrees. Both authors agree, however, that -s is
a functionally simple structure. In contrast, DeKeyser (1998) considers
-s to be functionally complex because of its highly syncretic nature,
expressing several abstract grammatical functions simultaneously (present
time, 3rd person, singular number). De Graaff (1997) operationalizes
structure complexity as the total number of formal and functional
grammatical criteria or features which determine the specific form and
function of a given structure and which are essential for its effective
noticing and processing. Yet another approach is exemplified by
Robinson's (1996) study, where expert SLA teachers were asked to
identify from a list of grammatical structures the ones they thought to be
more difficult for their students. (Housen et al., 2005, p. 242)
In carrying out this meta-analysis it was necessary to make a choice as
to which definition of complexity we would use. Given the lack of consensus
on the conceptualization and operational definition of complexity in the L2
literature, this was not an easy decision. In the end, we decided to adopt the
criteria proposed by Hulstijn and de Graaff (1994), who, as indicated earlier,
proposed that the degree of complexity is determined by the number of criteria
to be applied in order to arrive at the correct form. Their definition relies on
derivational rules (i.e., the number of transformations required to arrive at the
target form) and they argued that these rules determine the cognitive complexity
of a language feature. This definition is similar to the transformation criteria
proposed by Celce-Murcia and Larsen-Freeman (1999) in their English as a
second language (ESL) grammar book for second/foreign language teachers.
We chose this definition because it permitted us to categorize complexity across
a wide range of different language features represented in the primary studies,
it is relatively straightforward to apply, and its conceptualization has cognitive,
linguistic, and pedagogic value. More details on how we coded the linguistic
features are provided below.
In the next section we outline the methods and procedures used in the
study. This includes descriptions of how the primary studies were searched
and selected. Also included is information about how the language features,
instructional types, and language measures were coded. This is followed by
a description of how the effect sizes were calculated and the results of the
meta-analysis.
Method
Data Collection
The data for the meta-analysis were collected first through an extensive online
search of the instructed SLA literature. Norris and Ortega's (2000) examination
of the studies published between 1980 and 1998 revealed that most studies of
instructional effects on SLA were published after 1990. Therefore, only those
studies published after 1990 were included in this meta-analysis. The Education Resources Information Center (ERIC), Scholars Portal, and Linguistics &
Language Behavior Abstracts (LLBA) databases were selected as tools for the
online search, and the combinations of the following key words were utilized:
English, instruction, treatment, grammar, form, acquisition, teaching, ESL, and
EFL. An examination of the target grammar forms in these retrieved studies led
to a further search through the same online databases, adding new key words,
such as tense, past tense, possessive, determiner, article, plural, third person,
dative, passive, pseudo cleft, relative, and question.
In addition, the titles and abstracts of the following 10 online retrievable
journals were examined: Applied Linguistics, Canadian Modern Language
Review, International Journal of Applied Linguistics, International Review
of Applied Linguistics, Language and Education, Language Learning, The
Modern Language Journal, Second Language Research, Studies in Second
Language Acquisition, and TESOL Quarterly. Finally, the references of the
retrieved articles and the bibliography in Norris and Ortega's (2000) research
synthesis were consulted along with some book chapters and journal articles.
This literature search resulted in 103 published reports.
Criteria for Inclusion
The retrieved published reports were examined to determine whether they satisfied all of the following criteria for inclusion in the meta-analysis: (a) used an experimental or quasi-experimental design; (b) targeted English grammar;
(c) included comparisons of treatment and control/comparison groups and/or
pretests and posttests; (d) included an instructional treatment; and (e) provided
enough statistical information for computing effect sizes. Sixty-nine studies failed to meet these criteria and were excluded from
the meta-analysis.
In some instances, the same experimental study was reported in several
publications. In cases like these, we selected only one that provided enough
statistical information to calculate effect sizes. This procedure led to the exclusion of two more studies from the meta-analysis. Furthermore, there was
one study that appeared in a bibliography but was not retrievable and another
study that was retrievable but was not readable due to the poor condition of the
PDF file. It was often the case that a single publication reported the results of
multiple comparisons, and nine studies fell into this category. In the end, a total
of 30 publications that included 41 separate studies were selected. Among these
30 publications, 10 were included in Norris and Ortega's (2000) meta-analysis
and 20 are newly added to this meta-analysis.

Table 1 Number of study reports and publication years

                       No. of study reports(a)
                       Complex                Simple
Year of publication    Classroom(b)   Lab     Classroom   Lab     Total
2005–2006              4              1       5           0       8
2000–2004              0              4       2           1       7
1995–1999              2              5       2           1       8
1990–1994              2              4       1           0       7
Total                  8              14      10          2       30

(a) Some study reports targeted both complex and simple linguistic features.
(b) In an earlier meta-analysis (Spada & Tomita, 2008), we examined differences between
classroom and laboratory settings (see Note 3). However, due to the small number of
effect sizes in some categories, it was decided to reduce the number of independent
variables; thus, we excluded the variable of context/setting.

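The selection steps described above can be tallied in a few lines. This is only an illustrative check of the arithmetic, using the counts reported in the text; the dictionary labels are our own shorthand, not categories defined by the authors.

```python
# Counts are taken from the text; the labels are illustrative shorthand.
retrieved_reports = 103

exclusions = {
    "did not meet inclusion criteria": 69,
    "duplicate report of the same experiment": 2,
    "listed in a bibliography but not retrievable": 1,
    "retrieved but unreadable PDF": 1,
}

# Reports remaining after every exclusion step has been applied.
remaining_reports = retrieved_reports - sum(exclusions.values())
print(remaining_reports)  # 30
```

The result matches the 30 publications that the authors report as contributing 41 separate studies.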
Table 1 shows the publication dates of the 30 study reports included in the meta-analysis. Seven or eight studies examining the effects of instruction on the
acquisition of specific English grammatical features were published in every
5-year period from 1990 to 2004, whereas more recently, eight studies were
published within a 2-year period (2005–2006). The studies published between
1990 and 2004 were mainly laboratory studies (62.5%) and focused on complex
forms (70.8%). In contrast, among the studies published in 2005 and 2006, 90%
were conducted in classroom contexts and complex and simple forms were
investigated equally.
As with Norris and Ortega's (2000) meta-analysis, ours examines the effects of different types of instruction on L2 learning. However, our meta-analysis
is restricted to English grammar² and investigates whether the effects of instruction differ depending on the nature of the language feature (i.e., simple or complex). Among the 41 study samples that were retrieved from 30
published studies, 17 investigated simple forms and 24 examined complex
forms. Appendix A provides more detailed information about the sample sizes,
instructional contexts, language features, and instructional approaches in the
30 published studies included in the meta-analysis.

Table 2 Number of transformations: Complex and simple rules

Complex rule: Wh-question of an object of preposition
Example: Who did you talk to?
1. Wh-replacement (You [past] talk to who)
2. Wh-fronting (Who you [past] talk to)
3. Do support (Who you [past] do talk to)
4. Subject/auxiliary inversion (Who [past] do you talk to)
5. Affix attachment (Who [DO + past] you talk to)
6. Morphological rules (Who did you talk to?)
7. Fronting/leaving behind (To whom did you talk?/Who did you talk to?)

Simple rule: Regular past tense
Example: walked
1. [Past tense] + Verb
   [Past tense] + walk = walked

Coding for Linguistic Complexity
As indicated earlier, we used the number of linguistic transformation rules as the
basis for distinguishing between simple and complex forms. Table 2 provides
an example of the number of transformations required to form a complex and a
simple rule using these criteria. For example, the wh-question of an object of preposition is described as a complex feature because seven transformations must be applied to arrive at the sample
sentence Who did you talk to? On
the other hand, the past tense of regular verbs is defined as simple because
only one criterion needs to be met in order to arrive at the target form: suppliance of the -ed inflection. The process of coding all of the language forms in
the primary studies resulted in features that consisted of only one transformational rule and those that consisted of two or more transformations. We defined
the former as simple features and the latter as complex. With two exceptions
(yes/no questions with be-copula and wh-questions of subjects), all complex
features involved at least three or four transformations. The grammatical features characterized as simple were past tense, articles, plurals, prepositions,
subject-verb inversion, possessive determiners, and participial adjectives. The
features characterized as complex were dative alternation, question formation, relativization, passives, and pseudo-cleft sentences. These are presented
in Table 3.
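The coding rule described in this section (one transformation is simple, two or more is complex) can be sketched as follows. This is a minimal illustration, not the authors' coding instrument: the `Feature` class and `code_complexity` function are hypothetical names, and the transformation lists are copied from the two examples in Table 2.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Feature:
    """A target form and the transformations needed to derive it (hypothetical type)."""
    name: str
    transformations: Tuple[str, ...]


def code_complexity(feature: Feature) -> str:
    """Code a feature as 'simple' (one transformation) or 'complex' (two or more)."""
    if not feature.transformations:
        raise ValueError("a feature needs at least one transformation")
    return "simple" if len(feature.transformations) == 1 else "complex"


# The two worked examples from Table 2.
regular_past = Feature("regular past tense", ("[past tense] + verb",))
wh_object_of_preposition = Feature(
    "wh-question of an object of preposition",
    (
        "wh-replacement",
        "wh-fronting",
        "do support",
        "subject/auxiliary inversion",
        "affix attachment",
        "morphological rules",
        "fronting/leaving behind",
    ),
)

for feature in (regular_past, wh_object_of_preposition):
    print(feature.name, len(feature.transformations), code_complexity(feature))
```

A simple count threshold like this makes the scheme easy to apply across many features, which is one of the reasons the authors give for choosing it.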
Table 3 Simple and complex features

Simple features           Complex features
Tense                     Dative alternation
Articles                  Question formation
Plurals                   Relativization
Prepositions              Passives
Subject-verb inversion    Pseudo-cleft sentences
Possessive determiners
Participial adjectives

Coding for Instruction

Type of instruction was coded following Norris and Ortega (2000) in which
instruction was considered to be explicit if rule explanation comprised part of
the instruction or if learners were directly asked to attend to particular forms
and to try to arrive at metalinguistic generalizations on their own (p. 437).
Therefore, we coded instruction as explicit when it included grammar rule explanation (e.g., Benati, 2005; Master, 1994), L1/L2 contrasts (e.g., Ammar &
Lightbown, 2005; Spada, Lightbown, & White, 2005; White & Ranta, 2002),
and metalinguistic feedback (e.g., Carroll & Swain, 1993; Ellis, Loewen, &
Erlam, 2006). Instruction was coded as implicit if neither rule presentation
nor directions to attend to particular forms were part of a treatment (Norris
& Ortega, 2000, p. 437). Some examples of implicit instruction include input flood/high-frequency input (e.g., Spada & Lightbown, 1999; Williams &
Evans, 1998), interaction (e.g., Mackey, 1999), and recasts (e.g., Ellis et al.,
2006; Han, 2002; Mackey & Oliver, 2002). Interrater reliability was calculated
using Cohen's kappa (κ), which reflects both agreement and error. The interrater
reliability for coding instructional type was high at κ = .94. Length of instructional treatment, context (i.e., classroom and laboratory), and proficiency level
were also coded.³
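For readers unfamiliar with the statistic, the sketch below shows how Cohen's kappa is computed for two coders using the standard formula, κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is agreement expected by chance. The function name and the ten codes are hypothetical and are not the authors' data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who assigned one categorical code per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal category proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten treatments; the raters disagree on one item.
coder_1 = ["explicit"] * 5 + ["implicit"] * 5
coder_2 = ["explicit"] * 4 + ["implicit"] * 6
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))  # 0.8
```

In this made-up example the raters agree on 9 of 10 items (p_o = .90) and chance agreement is .50, giving κ = .80.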
Coding for Type of Knowledge
The outcome measures were coded according to whether they were controlled
or free constructed responses. This enabled us to investigate the claim that explicit instruction leads to monitored analyzed knowledge and implicit instruction is more likely to lead to unanalyzed, spontaneous knowledge (Krashen,
1982; Schwartz, 1993; Truscott, 1999). In their meta-analysis, Norris and
Ortega (2000) categorized outcome measures into four types: (a) metalinguistic judgments; (b) selected responses; (c) constrained constructed responses;
and (d) free constructed responses. Metalinguistic judgments require learners to
judge the grammaticality of a sentence, whereas selected responses ask learners
to select a correct answer in the form of multiple-choice questions. Constrained
constructed responses ask learners to produce language varying from one word
to a complete sentence. Free constructed responses are unrestricted in terms of
form and have meaning as the primary goal.
Due to a relatively small number of studies in our meta-analysis, we decided to collapse Norris and Ortega's (2000) four outcome measure categories
into two types: controlled and free constructed tasks. Our controlled task category consists of metalinguistic judgments, selected responses, and constrained
constructed responses. Some examples of controlled tasks used in the primary studies examined in this meta-analysis include grammaticality judgment
tests (e.g., Ellis et al., 2006; Robinson, 1996, 1997a, 1997b), multiple-choice
questions (e.g., Chan, 2006; Master, 1994), sentence combination tasks (e.g.,
Ammar & Lightbown, 2005), and scrambled sentence tasks (e.g., Spada &
Lightbown, 1999; White, Spada, Lightbown, & Ranta, 1991). Our free constructed task category is defined and operationalized in the same way as in Norris
and Ortega (2000). Some examples of these from the primary studies include free
writing tasks (e.g., Bitchener, Young, & Cameron, 2005; Spada et al., 2005),
picture description tasks (e.g., Izumi & Lakshmanan, 1998; Mackey, 2006),
and picture-cued information gap tasks (e.g., Mackey, 1999, 2006; Mackey
& Oliver, 2002). All measures in each primary study were coded either as a
controlled or a free task by two coders and the interrater reliability was high at
κ = .90.
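The collapsing of the four outcome-measure categories into two can be expressed as a simple lookup. This is a sketch of the mapping as described in the text; the dictionary and function names are our own and hypothetical.

```python
# Norris and Ortega's (2000) four categories mapped onto the two used here.
COLLAPSED_CATEGORY = {
    "metalinguistic judgment": "controlled",
    "selected response": "controlled",
    "constrained constructed response": "controlled",
    "free constructed response": "free",
}


def collapse_outcome_measure(measure_type: str) -> str:
    """Return 'controlled' or 'free' for one of the four original categories."""
    try:
        return COLLAPSED_CATEGORY[measure_type]
    except KeyError:
        raise ValueError(f"unknown outcome measure type: {measure_type!r}") from None


print(collapse_outcome_measure("selected response"))          # controlled
print(collapse_outcome_measure("free constructed response"))  # free
```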
The overall characteristics of the published study reports (n = 30) are shown
in Table 4. Although all studies reported sample sizes, several studies did not
provide exact information about the testing schedule (i.e., when the pretest
and immediate/delayed posttest took place). Our examination of the narrative
descriptions in these sample studies led us to code "within one week after the
treatment" as 3.5 days (n = 10) and "on the first day after the treatment" as
2 days (n = 2). Although all studies included immediate posttests, only half of
them included delayed posttests. In four cases there were two delayed posttests.
This information is detailed in Appendix B. Additional information about the
study reports, including descriptive statistics for learner and study characteristics (e.g., proficiency and educational levels of learners, study designs, and
contexts), is provided in Appendix C.

Table 4 Overall study characteristics

Characteristics                    Mean     SD        Min    Max    Mode(s)    n(a)
Sample size                        58.66    37.533    8      160    34         41
Immediate posttest (days)          4.89     7.52      0      28     0          41
First delayed posttest (weeks)     4.00     3.62      1      16     2          17
Second delayed posttest (weeks)    5        2.31      3      7      3          4

(a) Number of sample studies contributing to the meta-analysis. A study report may include
multiple sample studies.

Meta-Analysis
To investigate the effects of instruction on the acquisition of simple and complex forms, we calculated effect sizes for each study. There are several types of
effect size, such as Cohen's d, Hedges' g, Pearson's r, and Glass's delta (Lipsey
& Wilson, 2001).4 We calculated Cohen's d for each study in this meta-analysis.
Because most of the collected studies did not report effect sizes, it was necessary for us to calculate them based on the original data and we did this in
one of four ways (see Appendix D for the different formulas used to calculate
Cohen's d). Cohen's d was then used to examine between- and within-group
comparisons.5 First, an effect size was calculated by comparing treatment and
control/comparison groups at the immediate posttest to investigate the effects
of type of instruction (explicit/implicit) in relation to linguistic features (simple/complex) and outcome measures (controlled/free).6 Second, to examine the
durability of instruction, effect sizes were calculated for delayed posttests (first
and second) by comparing treatment and control/comparison groups. However,
due to the small number of studies that administered second delayed posttests,
it was not possible to include them in this meta-analysis. Third, to examine
the effects of instruction observed within each group, the immediate posttest
scores and pretest scores were compared within the same group to compute
the effect size. This included control and comparison groups, permitting us to
investigate learners' progress within each.
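The calculation described above can be illustrated with a minimal sketch. Appendix D is not reproduced in this section, so the pooled-standard-deviation form of Cohen's d used here is an assumption (it is the common textbook form), and the function names and the scores are hypothetical rather than taken from any primary study.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference: (mean_a - mean_b) / pooled SD."""
    return (mean_a - mean_b) / pooled_sd(sd_a, n_a, sd_b, n_b)


# Hypothetical scores, for illustration only.
# Between-group comparison: treatment vs. control at the immediate posttest.
d_between = cohens_d(15.0, 4.0, 20, 12.0, 4.0, 20)

# Within-group comparison: posttest vs. pretest of the same group, passed
# through the same between-group formula (the choice described in Note 5).
d_within = cohens_d(15.0, 4.0, 20, 11.0, 4.0, 20)

print(round(d_between, 2), round(d_within, 2))
```

When a primary study reports only a t or F value, a different conversion is needed; those variants are not sketched here.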
We calculated one effect size for each treatment group by averaging all effect
sizes gained from dependent variables for the group. Because a central question
motivating this research was whether instructional effectiveness is likely to vary
depending on the grammatical feature, we considered each grammatical form
as an independent variable. Such multiple effect sizes based on the same sample
are considered nonindependent observations (Lipsey & Wilson, 2001; Norris &
Ortega, 2006). However, as Keck, Iberri-Shea, Tracy-Ventura, and Wa-Mbaleka
(2006) suggested, as long as only descriptive statistics are reported (i.e., no
inferential testing is carried out), nonindependent observations do not present a problem
for the meta-analysis.
In conducting meta-analyses, the standardized mean difference effect size
tends to be biased upwardly when the sample size is small (Hedges, 1981).
Furthermore, effect sizes derived from larger sample sizes play a larger role in
statistical analyses because they have less sampling error than the effect sizes
from small sample sizes (Lipsey & Wilson, 2001). Thus, before comparing
effect sizes, we calculated unbiased and weighted effect sizes for each treatment
group (see Appendix D for the formula).
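As a hedged illustration of the two adjustments just described, the sketch below applies the small-sample correction and inverse-variance weighting in the form given in standard meta-analysis textbooks. Because Appendix D is not shown in this section, these exact formulas are an assumption, and the function names and input values are hypothetical.

```python
import math


def unbiased_d(d, n1, n2):
    """Small-sample correction (Hedges, 1981) applied to a raw d."""
    return d * (1 - 3 / (4 * (n1 + n2) - 9))


def inverse_variance_weight(d_unb, n1, n2):
    """Weight = 1 / SE^2, so effect sizes from larger samples count for more."""
    variance = (n1 + n2) / (n1 * n2) + d_unb**2 / (2 * (n1 + n2))
    return 1 / variance


def weighted_mean_d(effects):
    """effects: list of (d, n1, n2). Returns (mean d, SE, CI lower, CI upper)."""
    pairs = []
    for d, n1, n2 in effects:
        d_unb = unbiased_d(d, n1, n2)
        pairs.append((d_unb, inverse_variance_weight(d_unb, n1, n2)))
    total_weight = sum(w for _, w in pairs)
    mean_d = sum(d_unb * w for d_unb, w in pairs) / total_weight
    se = math.sqrt(1 / total_weight)
    return mean_d, se, mean_d - 1.96 * se, mean_d + 1.96 * se


# Hypothetical treatment groups: (raw d, treatment n, comparison n).
groups = [(0.90, 12, 12), (0.60, 40, 38), (1.20, 8, 9)]
mean_d, se, lower, upper = weighted_mean_d(groups)
print(f"weighted mean d = {mean_d:.2f}, SE = {se:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

The 95% confidence interval is taken as the weighted mean plus or minus 1.96 standard errors, which is the usual large-sample approximation.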
Following Rosenthal's (1991, 1995) procedures, we conducted one-sample
t tests on the unweighted effect sizes to see whether instruction was significantly
effective. We also examined the normality of the distributions graphically and searched
for unexpected moderator variables by running the homogeneity of variance
test. We then calculated a fail-safe N to investigate the file-drawer problem,
that is, the number of unpublished studies with a null result that would be needed to
reverse the results of our meta-analysis (Hsu, 2002; Orwin, 1983).7 Rosenthal
(1979, 1991) developed a formula to do this that estimates the number of file-drawer studies with an average validity coefficient of zero that would be needed
to overturn the rejection of the null hypothesis. These results along with the
main findings of the study are reported next.
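The three checks named above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the homogeneity statistic is taken to be the weighted sum of squared deviations (Q) referred to a chi-square distribution with k - 1 degrees of freedom, both the Orwin (1983) and Rosenthal (1979) fail-safe formulas are shown because the text cites both, and all function names and input values are hypothetical.

```python
import math
import statistics


def one_sample_t(effect_sizes):
    """t statistic and df for the null hypothesis that the mean effect size is zero."""
    k = len(effect_sizes)
    se = statistics.stdev(effect_sizes) / math.sqrt(k)
    return statistics.mean(effect_sizes) / se, k - 1


def homogeneity_q(effect_sizes, weights):
    """Q = sum of w_i * (d_i - weighted mean)^2, with k - 1 degrees of freedom."""
    mean_d = sum(w * d for d, w in zip(effect_sizes, weights)) / sum(weights)
    q = sum(w * (d - mean_d) ** 2 for d, w in zip(effect_sizes, weights))
    return q, len(effect_sizes) - 1


def failsafe_n_orwin(k, mean_d, criterion_d):
    """Null-result studies needed to pull the mean effect size down to criterion_d."""
    return k * (mean_d - criterion_d) / criterion_d


def failsafe_n_rosenthal(z_scores, z_alpha=1.645):
    """Studies averaging Z = 0 needed to make the combined result nonsignificant."""
    return sum(z_scores) ** 2 / z_alpha**2 - len(z_scores)


# Hypothetical inputs, for illustration only.
ds = [0.5, 1.0, 1.5]
t_value, df = one_sample_t(ds)
q_value, q_df = homogeneity_q(ds, [10.0, 12.0, 8.0])
print(f"t({df}) = {t_value:.2f}; Q({q_df}) = {q_value:.2f}")
print(f"Orwin fail-safe N = {failsafe_n_orwin(10, 0.6, 0.2):.1f}")
```

A significant Q suggests that the variation among effect sizes exceeds what sampling error alone would produce; obtaining its p value requires a chi-square distribution function, which is omitted here to keep the sketch dependency-free.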
Results
Table 5 presents the unweighted, unbiased effect sizes of the 30 studies at the
immediate and at the first delayed posttest. The unweighted effect sizes ranged
from 0.7 to 5.50. Table 6 presents the results of one-sample t tests conducted
for the four groups: (a) explicit instruction and simple linguistic features; (b)
explicit instruction and complex linguistic features; (c) implicit instruction and
simple linguistic features; and (d) implicit instruction and complex linguistic
features. The results indicate that although the effects of explicit instruction
were statistically significant for both linguistic features, the effects of implicit
instruction were significant for complex but not simple features.
The results of the homogeneity of variance test presented in Table 6 indicate
that three out of four groups obtained statistical significance, meaning that
differences within each group are not likely the result of sampling errors but
rather other moderating factors. Norris and Ortega (2006) emphasized the
importance of checking the normal distribution graphically rather than simply
depending on the homogeneity tests because sample sizes in applied linguistics
are usually small and the homogeneity test is likely to turn out to be significant.
Thus, we verified the distributions of effect sizes within the four groups and
these are graphically represented in Table 7. The stem and leaf plots display
Table 5 Effect sizes of individual treatment groups

Effect sizes (d)
Treatment groupsa
Study
Ammar & Lightbown (2005)
Benati (2005)
Study 1
Study 2
DOG (E)
OPG (E)
SDO (E)
PI (E)
MOI (E)
PI (E)
MOI (E)
Immediate
posttest
Delayed
posttest
5.50
3.77
4.21
2.80
0.59
0.76
0.09
Bitchener et al. (2005)

Study 1
Study 2
Study 3
Chan (2006)
Ellis et al. (2006)
Mackey (2006)
Master (1994)
Muranoi (2000)
Study 1
Study 2
Study 3
Study 1
Study 2
Spada & Lightbown (1999)

Spada et al. (2005)
Study 1
Study 2
Takashima & Ellis (1999)
White & Ranta (2002)
White et al. (1991)
CF + Conference (E)
CF (E)
CF + Conference (E)
CF (E)
CF + Conference (E)
CF (E)
Explicit (E)
Implicit (I)
Explicit (E)
Implicit (I)
Implicit (I)
Implicit (I)
Treatment (E)
IE/F (E)
IE/M (I)
IE/F (E)
IE/M (I)
Implicit (I)
Q/CA (E)
Q/NCA (E)
PD/CA (E)
PD/NCA (E)
Experiment (I)
Rule (E)
0.86
0.16
0.75
0.36
1.72
0.63
0.50
0.32
0.79
0.75
0.18
0.45
0.65
1.78
0.57
0.94
0.12
0.11
1.70
1.27
0.86
0.93
0.70
0.57
0.91
0.23
1.43
0.97
0.49
0.93
0.34
0.71
(Continued)
Table 5 Continued
Effect sizes (d)
Treatment groupsa
Study
Williams & Evans (1998)
Study 1
Study 2
Study 1
Study 2
Yip (1994)
Carroll & Swain (1993)
Doughty (1991)
Fotos & Ellis (1991)
Study 1
Study 2
Han (2002)
Izumi (2002)
Izumi & Izumi (2004)

Izumi & Lakshmanan (1998)
Kubota (1994)
Kuiken & Vedder (2002)

Mackey (1999)
Mackey & Oliver (2002)
Explicit (E)
Explicit (E)
Flood (I)
Explicit (E)
Flood (I)
Explicit (E)
CR (E)
Meta (E)
Wrong (E)
Implicit (I)
Sure (I)
MOG (I)
ROG (E)
Explicit Task (E)
Explicit Grammar (E)
Explicit Task (E)
Explicit Grammar (E)
Recast (I)
O/IE (I)
O/-IE (I)
-O/IE (I)
-O/-IE (I)
Output (I)
Non-Output (I)
Experiment (E)
Rule (E)
Wrong (E)
Answer (I)
Sure (I)
Interaction (I)
Interactors (I)
Unreadies (I)
Observers (I)
Scripted (I)
Implicit (I)
Immediate
posttest
1.35
1.16
1.55
1.27
0.02
1.52
1.03
1.54
1.11
1.37
1.04
0.42
0.06
2.26
2.52
1.17
2.19
1.16
0.75
0.51
0.05
0.15
0.31
0.69
1.27
0.02
0.10
0.31
0.17
0.75
1.16
1.51
0.88
0.06
0.92
Delayed
posttest
1.21
1.65
0.95
1.08
1.08
1.17
1.30
2.28
1.57
1.6
0.43
0.30
0.41
0.17
0.04
1.13
(Continued)
Table 5 Continued
Effect sizes (d)
Treatment groupsa
Study
Mackey & Philp (1998)
Interactor Ready (I)

Interactor Unready (I)
Recast Ready (I)
Recast Unready (I)
McDonough (2005)
Enhanced Opportunity (I)
Opportunity (I)
Feedback (I)
Robinson (1996, 1997b) Study 1 Implicit (I)
Rule-Search (E)
Instructed (E)
Study 2 Implicit (I)
Rule-Search (E)
Instructed (E)
Robinson (1997a)
Implicit (I)
Enhanced (E)
Instructed (E)
Immediate
posttest
0.78
0.81
0.70
0.59
1.00
0.48
0.00
0.06
0.14
0.43
0.06
0.00
0.55
0.36
0.24
0.14
Delayed
posttest
1.06
1.83
0.47
0.61
Implicit and explicit groups are represented as I and E, respectively. Treatment groups
in some studies are treated as control groups.
Table 6 One-sample t test, homogeneity test, and fail-safe N
Group             One-sample t test       Homogeneity test          Fail-safe N^a
Explicit-simple   t(19) = 5.01, p < .001  χ²(19) = 65.48, p < .01   N = 34.45 (>6.19)
Explicit-complex  t(23) = 4.98, p < .001  χ²(23) = 146.78, p < .01  N = 11.4 (>7.14)
Implicit-simple   t(8) = 2.15, p = .064   χ²(8) = 15.56, p > .01    N = 5.85 (>4.28)
Implicit-complex  t(28) = 5.71, p < .001  χ²(28) = 50.99, p < .01   N = 14.3 (>7.14)
a Criteria of fail-safe Ns (FSN) are represented in parentheses. The criteria were calculated based on the ratio of FSN and sample studies as described in Lipsey and Wilson
(2001, p. 166).
Table 7 Stem and leaf of effect sizes
Group             Frequency  Stem  Leaf
Explicit-simple   2.00       0.    13
                  2.00       0.    00
                  12.00      0.    555667778899
                  3.00       1.    577
                  1.00       2.    8
Explicit-complex  2.00       0.    01
                  6.00       0.    001244
                  8.00       1.    01112223
                  2.00       1.    56
                  2.00       2.    12
                  1.00       2.    5
                  1.00       3.    7
                  1.00       4.    2
                  1.00       5.    5
Implicit-simple   2.00       0.    04
                  3.00       0.    013
                  2.00       0.    56
                  2.00       1.    11
Implicit-complex  1.00       0.    7
                  3.00       0.    001
                  9.00       0.    001133344
                  10.00      0.    5577777889
                  4.00       1.    0013
                  2.00       1.    55
relatively normal distributions among all groups except for the explicit-complex
group, which had three extremely large effect sizes (d = 3.7, 4.2, and 5.5), all
of which were from a single study (i.e., Ammar & Lightbown, 2005). With
relatively normal distributions for three of the four groups, we felt confident
continuing with the meta-analysis, keeping in mind the large effect sizes from
this one study.
The results of the fail-safe N test indicate that the four groups are well
above the criteria that are indicated in the parentheses in Table 6. This means
that unpublished studies reporting nonsignificant findings are unlikely to reverse
the findings (see Lipsey & Wilson, 2001; Orwin, 1983; Rosenthal, 1979, 1991,
for more information on the fail-safe N test and how to calculate it). Hereafter,
we use d to refer to the weighted mean effect size based on unbiased effect
sizes in presenting the main findings of the meta-analysis.
Table 8 Type of instruction and language feature
                      n^a  k^b  Mean d (weighted)  SE    95% CI lower  95% CI upper
Explicit instruction
  Complex forms       15   24   0.88               0.06  0.76          1.00
  Simple forms        13   20   0.73               0.08  0.58          0.88
Implicit instruction
  Complex forms       15   29   0.39               0.07  0.26          0.51
  Simple forms        9    9    0.33               0.11  0.11          0.55
a Number of sample studies contributing to the meta-analysis.
b Number of treatment groups contributing to the meta-analysis.
[Figure: effect size (d) plotted by instructional treatment: Explicit-Complex, Explicit-Simple, Implicit-Complex, Implicit-Simple]
Figure 1 Mean effect sizes and confidence intervals: Types of instruction.
Table 8 and Figure 1 display the mean effect sizes and confidence intervals
for explicit and implicit instruction on the acquisition of complex and simple
language features. The effect sizes for explicit instruction are larger than those
for implicit instruction.8 The largest effect size is for explicit instruction for
complex forms (d = 0.88). Excluding the three study samples with unusually
high effect sizes also resulted in a large effect size (d = 0.83). The effect
sizes for implicit instruction are small, with d = 0.39 for complex forms and
d = 0.33 for simple forms. As illustrated in Figure 1, the confidence intervals for explicit and implicit
instruction do not overlap with each other, indicating that these differences are
reliable.9 Following Rosenthal (1991, 1995), we further examined the binomial
effect size display to determine what these effect sizes mean in practice. As Table 9
shows, approximately 70% of participants receiving explicit instruction on
complex or simple forms showed success or improvement, whereas only about
30% of participants in control groups showed success when learning complex
or simple forms. Of those receiving implicit instruction, approximately 60%
showed improvement on simple and complex forms, whereas about 40% in the
control groups improved on both forms.
Table 9 Binomial effect size display
                  Success  Failure
Explicit-simple   67       33
Control           33       67
Explicit-complex  70       30
Control           30       70
Implicit-simple   58       42
Control           42       58
Implicit-complex  60       40
Control           40       60
Table 10 Effect sizes: Pretest to posttest
                      n^a  k^b  Mean d (weighted)  SE    95% CI lower  95% CI upper
Explicit instruction
  Complex forms       11   16   0.84               0.07  0.70          0.98
  Simple forms        12   18   0.88               0.08  0.72          1.05
Implicit instruction
  Complex forms       9    18   0.29               0.08  0.15          0.45
  Simple forms        5    5    0.66               0.16  0.35          0.96
Control group
  Complex forms       9    9    0.04               0.12  -0.20         0.27
  Simple forms        12   12   0.28               0.10  0.09          0.47
a Number of sample studies contributing to the meta-analysis.
b Number of treatment groups contributing to the meta-analysis.
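The binomial effect size display in Table 9 can be approximated with a short sketch. It assumes the usual conversion from d to r for equal group sizes, r = d / sqrt(d^2 + 4), and success rates of 0.5 ± r/2; the text does not state which conversion was used, so this is an assumption, and the function names are hypothetical. Under that assumption, the weighted mean effect sizes in Table 8 yield the percentages shown in Table 9.

```python
import math


def d_to_r(d):
    """Convert a standardized mean difference to r, assuming equal group sizes."""
    return d / math.sqrt(d**2 + 4)


def besd(d):
    """Binomial effect size display: (treatment success %, control success %)."""
    r = d_to_r(d)
    return round(100 * (0.5 + r / 2)), round(100 * (0.5 - r / 2))


# Weighted mean effect sizes reported in Table 8.
table_8 = {
    "Explicit-simple": 0.73,
    "Explicit-complex": 0.88,
    "Implicit-simple": 0.33,
    "Implicit-complex": 0.39,
}
for label, d in table_8.items():
    treatment, control = besd(d)
    print(f"{label}: treatment {treatment}%, control {control}%")
```

The display recasts an effect size as the difference between two hypothetical success rates centered on 50%, which is why the treatment and control rows in Table 9 mirror each other.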
Table 10 and Figure 2 display the effect sizes representing the magnitude of
change within groups from pretest to posttest along with the 95% confidence
intervals. A similar pattern exists, in that explicit instruction has larger effect
sizes than implicit instruction for both complex and simple forms. Figure 2
reveals that the confidence intervals for explicit instruction rarely overlap with those for implicit
or control groups, indicating that the effect sizes for explicit instruction are
trustworthy. The only exception is a slight overlap between implicit and explicit
instruction on simple forms. Figure 2 also reveals positive effects for implicit
instruction; however, the effect sizes are small for complex forms (d = 0.29) and medium
for simple forms (d = 0.66). Similar to Norris and Ortega (2000), our
results show evidence of some growth in the control group. This is particularly
noticeable with learning simple forms (d = 0.28), which may be a reflection of
the natural maturation of simple forms.
[Figure: effect size (d) plotted by group: Explicit-Complex, Explicit-Simple, Implicit-Complex, Implicit-Simple, Control-Complex, Control-Simple]
Figure 2 Mean effect sizes and confidence intervals: Pretest and posttest.
Overall these results indicate a more positive role for explicit instruction
and, furthermore, suggest that explicit instruction works better than implicit
instruction for both simple and complex features. However, this analysis does
not take into account the effects of instruction over time. These results are
presented below.
Table 11 presents mean effect sizes and 95% confidence intervals for the
first delayed posttest. Not all of the primary studies reported the results of
delayed posttests with sufficient statistical information to calculate effect sizes.
Nonetheless, of those studies that did, the effect sizes for explicit instruction
on both complex and simple forms are once again larger than those for implicit
instruction. Figure 3 displays the effect sizes at two different times: immediate
and first delayed posttests. Effect sizes for all four explicit and implicit groups increase
at the first delayed posttest.
Table 12 and Figure 4 present the mean effect sizes and 95% confidence
intervals for types of outcome measures (i.e., controlled and free). As expected,
controlled outcome measures yield relatively large effect sizes after explicit
instruction (d = 0.84 for complex features and d = 0.78 for simple features).10
Table 11 Effect sizes: First delayed posttests
                      n^a  k^b  Mean d (weighted)  SE    95% CI lower  95% CI upper
Explicit instruction
  Complex forms       7    10   1.02               0.06  0.90          1.15
  Simple forms        3    3    1.01               0.18  0.67          1.36
Implicit instruction
  Complex forms       5    10   0.56               0.13  0.32          0.81
  Simple forms        5    5    0.51               0.14  0.24          0.79
a Number of sample studies contributing to the meta-analysis.
b Number of treatment groups contributing to the meta-analysis.
[Figure: effect size (d) at the immediate and delayed posttests for Explicit-Complex, Explicit-Simple, Implicit-Complex, Implicit-Simple]
Figure 3 Mean effect sizes of posttests: Immediate and first delayed posttests.
However, the largest effect size across the four groups is for free outcome
measures after explicit instruction on complex features (d = 0.86). In addition,
the effect sizes for free outcome measures after explicit instruction with simple
features yielded a medium effect size (d = 0.63). These results are rather
surprising and are discussed below.
Regarding the effects of implicit instruction on different outcome measures, the effect sizes were small overall, with only one medium effect size for
free measures on simple forms (d = 0.56), as shown in Table 12 and Figure 5. It
is important to note, however, that the confidence intervals for each instruction
group presented in Figures 4 and 5 indicate considerable overlap, suggesting
that the differences are not as trustworthy as they could be. This may be due
to the small number of sample studies. For example, only four sample studies
contributed to the largest effect size for free outcome measures on complex
forms after explicit instruction. It may also be due to the possibility that the
controlled and free outcome measures are not sufficiently distinct in the type
of L2 knowledge they tap. This will be discussed below along with the other
findings.
Table 12 Effect sizes for controlled and free outcome measures
                      n^a  k^b  Mean d (weighted)^c  SE    95% CI lower  95% CI upper
Explicit instruction
  Complex forms
    Free              4    5    0.86                 0.14  0.58          1.13
    Controlled        13   21   0.84                 0.07  0.71          0.97
  Simple forms
    Free              5    9    0.63                 0.11  0.41          0.85
    Controlled        8    12   0.78                 0.11  0.57          1.00
Implicit instruction
  Complex forms
    Free              7    15   0.23                 0.09  0.06          0.39
    Controlled        10   16   0.34                 0.07  0.20          0.49
  Simple forms
    Free              4    4    0.56                 0.19  0.19          0.92
    Controlled        2    2    0.17                 0.30  -0.42         0.77
a Number of sample studies contributing to the meta-analysis.
b Number of treatment groups contributing to the meta-analysis.
c Some studies combined the scores on free and controlled tasks. These scores are not
included in this table. Only those studies that reported separate scores for free and
controlled tasks contributed to the meta-analysis.
[Figure: effect size (d) plotted by outcome measure for explicit instruction: Complex-Control, Complex-Free, Simple-Control, Simple-Free]
Figure 4 Mean effect sizes and confidence intervals for explicit instruction: Controlled
and free outcome measures.
[Figure: effect size (d) plotted by outcome measure for implicit instruction: Complex-Control, Complex-Free, Simple-Control, Simple-Free]
Figure 5 Mean effect sizes and confidence intervals for implicit instruction: Controlled
and free outcome measures.
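The reading of confidence intervals applied above (overlapping intervals suggest a less trustworthy difference, and an interval crossing zero suggests an unreliable effect; see Note 9) can be expressed as a small sketch. The interval values are the 95% confidence limits reported in Tables 8 and 12; the function names are hypothetical, and the overlap rule is the informal criterion described in the notes rather than a formal significance test.

```python
def intervals_overlap(ci_a, ci_b):
    """True when two (lower, upper) confidence intervals share any values."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


def crosses_zero(ci):
    """True when a (lower, upper) confidence interval includes zero."""
    return ci[0] <= 0 <= ci[1]


# Explicit instruction on complex forms, by outcome measure (Table 12).
explicit_complex_free = (0.58, 1.13)
explicit_complex_controlled = (0.71, 0.97)
print(intervals_overlap(explicit_complex_free, explicit_complex_controlled))  # True

# Explicit vs. implicit instruction on complex forms (Table 8).
explicit_complex = (0.76, 1.00)
implicit_complex = (0.26, 0.51)
print(intervals_overlap(explicit_complex, implicit_complex))  # False
```

Non-overlap of two 95% intervals is a conservative criterion: intervals can overlap slightly even when a formal test of the difference would reach significance.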
Discussion and Conclusion

The results of this meta-analysis are discussed with reference to the two research
questions outlined earlier, beginning with the first question: Do the effects of
explicit and implicit instruction vary with simple and complex features in the
short and long term? The findings indicate that the effect sizes for explicit
instruction on simple and complex features are consistently larger than those
for implicit instruction. The results also indicate that the effects of explicit
and implicit instruction increased on the delayed posttest and that explicit
instruction yielded larger effect sizes than implicit instruction. The overall
advantages for explicit over implicit instruction are consistent with the results
of Norris and Ortega (2000). The finding that explicit instruction was more
beneficial for complex and simple forms is consistent with Robinson (1996),
who reported benefits for explicit instruction with complex features, and with
DeKeyser (1995) and Williams and Evans (1998), who reported advantages of
explicit instruction for simple features (although not with complex features).
The results are also similar to other research in which explicit instruction has
been found to be equally beneficial for simple and complex features (de Graaff,
1997; Housen et al., 2005).
With regard to the second question (Do the effects of explicit and implicit
instruction lead to similar types of language ability for complex and simple
forms?), the results of this meta-analysis do not permit a clear answer and
this is mainly due to the small number of studies in some of the categories.
Nonetheless, the effect sizes were medium or large for performance on both
controlled and free measures after explicit instruction, with the largest effect
size obtained for knowledge of complex forms measured by free constructed
response measures. These findings seem to contradict the noninterface position
and its claim that explicit instruction does not result in unconscious, unanalyzed
knowledge available for use in spontaneous communication (e.g., Krashen,
1982; Schwartz, 1993; Truscott, 1999). Indeed, the positive effects of explicit
instruction on measures of spontaneous L2 production could be interpreted
as support for the strong interface position and the argument that declarative
(i.e., explicit) knowledge obtained via explicit instruction can be converted into
procedural (i.e., implicit) knowledge with practice (DeKeyser, 1998; Hulstijn,
1995). One might argue, however, that the measures characterized as free in
the primary studies examined in this meta-analysis may not represent pure
measures of spontaneous ability tapping into exclusively implicit knowledge.
For example, in several of the primary studies included in this meta-analysis,
picture-cued oral performance tasks were used to measure L2 progress. During
such tasks, learners are typically asked questions about the pictures to which
they respond quickly without time for planning. The assumption behind the
use of such tasks is that no planning time will prevent the activation of explicit
knowledge and will trigger implicit knowledge. However, there are concerns
about whether tasks such as these actually serve as a measure of automatized
explicit knowledge rather than implicit knowledge (Ellis, 2002; Housen et al.,
2005). Thus, one cannot be certain that the oral production tasks used in
the primary studies for this meta-analysis are indeed measures of implicit
knowledge. Not only do we need more studies to examine the contributions of
instruction to different types of L2 knowledge, but, perhaps more importantly, we also
need validation studies of measures of implicit (and explicit) knowledge
(Ellis, 2005).
The primary studies investigating the effects of implicit instruction on
controlled and free outcome measures revealed small and medium effect sizes
for both simple and complex features. Thus, regardless of linguistic complexity,
implicit instruction does not appear to have as significant an effect as explicit
instruction. This may be because implicit instruction takes a longer time to be
effective and none of the studies in this meta-analysis included more than 10 hr
of instruction. However, implicit instruction did yield a medium effect size for
free tasks examining learners' use of simple forms (see Table 12). Nonetheless,
the smaller effect sizes for implicit instruction on free outcome measures do
not support the claim that the benefits of implicit instruction are more likely to
be reflected on measures that tap into learners' unconscious, spontaneous L2
knowledge.
In Norris and Ortega's (2000) meta-analysis, only 16% of the studies used
free outcome measures. This led them to question whether the overall benefits
they observed for explicit instruction were a result of the fact that most of the
language measures were measuring explicit knowledge. In Doughty's (2003)
reanalysis of those data, she confirmed the bias and reiterated the call for the
use of more measures of implicit knowledge in instructed SLA research. The
studies included in this meta-analysis were published as recently as 2006, and
50% of them utilized free outcome measures. When we further examine the
types of outcome measures closely, there are 69 individual tasks included in
the 30 published studies. Among these tasks the most frequently used were
metalinguistic judgments (33.3%) and free constructed tasks (33.3%), followed by constrained constructed responses (23.2%) and selected responses
(10.1%) (see the coding for type of knowledge in the methodology section). In
another recent meta-analysis of interaction-based SLA research, Mackey and
Goo (2007) reported that 52% of the outcome measures used in the studies were
open-ended production measures (e.g., oral production tasks and writing tests).
Thus, it would appear that SLA researchers are responding to the call for more
measures of spontaneous, unanalyzed (i.e., implicit) knowledge and use.
Related to the above is the fact that more studies in Norris and Ortega's
(2000) meta-analysis investigated the effects of explicit rather than implicit
instruction. Out of a total of 45 studies examined in their research, 37 (82%)
included at least one explicit instructional group compared with 21 (46%) that
had at least one implicit group. When we examined how many of the 30 studies
included in this meta-analysis investigated the effects of implicit instruction
on L2 learning, we found that 19 (63%) of them focused on implicit teaching
either exclusively (n = 11 studies) or in contrast with explicit instruction (n =
8 studies). Among these 19 studies, only 7 were included in the Norris and
Ortega meta-analysis because the remaining 12 were published after 1998,
their cutoff point. The same 12 studies were included in this meta-analysis and
nearly 70% of them used free outcome measures to examine learners' progress.
This indicates that not only has there been an increase in the number of studies
investigating the effects of implicit instruction on L2 learning but that this is
likely related to the development and use of more measures of implicit L2
knowledge in instructed SLA research.
The overall findings of this meta-analysis indicate that explicit instruction is
more effective than implicit instruction for both simple and complex features.
Thus, these findings do not appear to support the hypothesis that type of language feature interacts with type of instruction. However, an important caveat
is in order, which is related to the way in which complex and simple features
were defined (i.e., number of linguistic transformations). As indicated in the
introduction, there are many ways of defining complexity. In fact, an examination of whether and how the investigators of the primary studies included in
this meta-analysis described complex/simple or difficult/easy features revealed
at least eight different categories. These included developmental stage, L1/L2
differences, form-meaning relationships, learnability, teachers' perceptions of
learner difficulty, the lexical preference principle, structure complexity, and
typological markedness.
We acknowledge that if we had chosen to use a different set of criteria to distinguish the two types of language features, the results may have
been different.11 For example, the use of psycholinguistic criteria would have
avoided the problematic categorization of articles as simple language features
indicated in Table 3. Every language teacher knows that articles in English
pose persistent problems for second/foreign language learners even at very advanced
levels. This is related to the semantic complexity of articles and to their lack
of perceptual salience in the input. Indeed, in the Goldschneider and DeKeyser
(2005) meta-analysis referred to earlier, they concluded that salience was a
common denominator for a wide range of complexity factors examined in their
study (e.g., morphophonological regularity, syntactic category, frequency, and
semantic complexity).
Whether one works with linguistic or psycholinguistic criteria, they are
both objective measures of complexity and do not take into consideration
characteristics of the learner. Complexity is also
an individual issue that can be described as the ratio of the rules inherent
linguistic complexity to the student's ability to handle such a rule. What is
a rule of moderate difficulty for one student may be easy for a student with
more language learning aptitude or language learning experience.
(DeKeyser, 2003, p. 331)
Thus, the subjective difficulty of the language feature adds another complexity to investigating the effects of different types of instruction on simple and
complex features.12
An intriguing finding from this research is that whereas explicit instruction
was found to be superior in contributing to learners' explicit knowledge of
complex and simple forms, it also appears to have contributed to their ability
to use these features in unanalyzed and spontaneous ways. Although implicit
instruction was not found to be as effective as explicit instruction, implicit
instruction did consistently lead to reliable gains of small or medium effect
sizes. To be sure, more research is needed to be more confident of these
findings, specifically more studies investigating the effects of implicit and
explicit instruction on different types of L2 knowledge. There is also a need
for more studies to include delayed posttests in their design, making it possible
to observe changes in L2 learners' knowledge over time. Continued research
in instructed SLA will undoubtedly be of great help in future meta-analyses of
the effects of type of instruction on the learning of specific L2 features.
Revised version accepted 18 February 2009
Notes
1 One of the anonymous reviewers suggested that we not use the terms explicit and
implicit because they are difficult to define and considerable disagreement exists in
the literature regarding their conceptualization and use. We have tried to reduce the
ambiguity by defining our use of explicit and implicit L2 knowledge and by
maintaining consistency in our reference to them.
2 We chose to focus only on English because we (not the original researchers)
categorized the target features in the primary studies as simple or complex. We felt
that a more advanced level of grammatical knowledge was required in order to
make decisions about the number of transformations required to arrive at the
correct target form in other languages.
3 In an earlier meta-analysis (see Spada & Tomita, 2008), we included information
about length of instructional treatment, context (laboratory/classroom), and
proficiency level. However, the small number of studies in some of the comparisons
weakened the results and made interpretations difficult.
4 These effect sizes can be transformed from one to the other (Norris & Ortega,
2006; Rosenthal, Rosnow, & Rubin, 2000; Vacha-Haase & Thompson, 2004).
5 There are different effect size formulas for between- and within-group comparisons
(see Lipsey & Wilson, 2001, for the formulas and discussion). The formula for
within-group comparisons requires detailed statistical information. However, most
of the primary studies in our meta-analysis did not report sufficient statistical
information to compute effect sizes for the within-group comparisons. Thus,
following Norris and Ortega (2000) and Keck et al. (2006), we used the formula for
between-group comparisons consistently throughout our meta-analysis.
6 There were many occasions when studies did not compare instructional treatment
groups with a control/comparison groups. In spite of the tendency to have
statistically biased effect sizes due to lack of control groups (Norris & Ortega,
2006), we included as many published SLA studies as possible in order to examine
290
Spada and Tomita
10
11
12
the widest range of instructional types and linguistic features in our meta-analysis.
Thus, following Norris and Ortegas (2000) procedures, we calculated effect sizes
in two additional ways: (a) When a study did not include a control or comparison
group, we chose one of the instructional treatment groups as the baseline
comparison condition (p. 446) and calculated effect sizes by comparing the
baseline comparison condition and the other instructional treatment groups; and (b)
when a study did not include a control or comparison group but reported pretest
and posttest results, we calculated effect sizes based on the pretest and posttest
performance. These procedures also applied to the effect size calculation for
delayed posttests.
We are grateful to one of the anonymous reviewers for pointing out the fail-safe N
test and the homogeneity test to establish greater confidence in the results of the
meta-analysis. To our knowledge, these tests have not been used in other
meta-analyses published in the field of SLA.
The general guidelines for interpreting effect sizes are that those between 0.2 and
0.5 are small effects, those between 0.5 and 0.8 are medium-sized effects, and those
greater than 0.8 are considered large effects (Lipsey & Wilson, 2001).
9 In the figures, the top and bottom of the box indicate the upper and lower bounds of the 95%
confidence interval. A box that crosses the zero-value line means that the result may be due to
chance and is, therefore, not trustworthy. When the boxes do not overlap with each
other, the differences are reliable.
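The two checks described in this note can be sketched in Python as follows. The interval values are hypothetical and the function names are illustrative; this is only one way to express the reading of the figures, not a procedure taken from the study.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower, upper) bounds of a 95% confidence interval


def crosses_zero(ci: Interval) -> bool:
    """True when the interval includes zero, so the effect may be due to chance."""
    lower, upper = ci
    return lower <= 0.0 <= upper


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True when two intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical intervals, for illustration only.
first = (0.55, 1.01)
second = (0.10, 0.45)
print(crosses_zero(first), crosses_zero(second))
print(intervals_overlap(first, second))
```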
10 Without the outliers (i.e., three comparisons in the Ammar & Lightbown, 2005,
study), the effect size was d = 0.78 for complex features. The outliers contributed only
to the effect sizes for controlled measures, not for free constructed tasks.
11 One of the anonymous reviewers suggested that we calculate effect size
comparisons using different classifications of simple and complex forms. Although
this would be a useful undertaking, it is beyond the goals of this study.
12 Nonetheless, the fact that some students find some structures easier than other
students do does not rule out psycholinguistic or pedagogic definitions of complexity.
As one of the anonymous reviewers pointed out, and we agree, the same
structures are probably still more difficult on average for all students, particularly if
they share the same first language.
References
References with one asterisk are the 103 study reports that were retrieved through the literature
search. References with two asterisks are the 30 published study reports that were included in this
meta-analysis.
Alanen, R. (1995). Input enhancement and rule presentation in second language
acquisition. In R. Schmidt (Ed.), Attention and awareness in foreign language
learning and teaching (Tech. Rep. No. 9; pp. 259–302). Honolulu: University of
Hawaii, Second Language Teaching & Curriculum Center.
Ammar, A., & Lightbown, P. M. (2005). Teaching marked linguistic structure: More
about the acquisition of relative clauses by Arab learners of English. In A. Housen
& M. Pierrard (Eds.), Investigations in instructed second language acquisition (pp.
167–198). Amsterdam: Mouton de Gruyter.
Bardovi-Harlig, K. (1994). Reverse-order reports and the acquisition of tense: Beyond
the principle of chronological order. Language Learning, 44, 243–282.
Bardovi-Harlig, K. (1995). A narrative perspective on the development of the
tense/aspect system in second language acquisition. Studies in Second Language
Acquisition, 17, 263–291.
Bardovi-Harlig, K. (1997). Another piece of the puzzle: The emergence of the present
perfect. Language Learning, 47, 375–422.
Bardovi-Harlig, K. (1998). Narrative structure and lexical aspect: Conspiring factors
in second language acquisition of tense-aspect morphology. Studies in Second
Language Acquisition, 20, 471–508.
Bardovi-Harlig, K. (1999). From morpheme studies to temporal semantics: Tense-aspect
research in SLA. Studies in Second Language Acquisition, 21, 341–382.
Bardovi-Harlig, K. (2000). Tense and aspect in second language acquisition: Form,
meaning, and use. Malden, MA: Blackwell.
Bardovi-Harlig, K. (2002). Analyzing aspect. In R. Salaberry & Y. Shirai (Eds.), The
L2 acquisition of tense-aspect morphology (pp. 129–154). Amsterdam: Benjamins.
Bardovi-Harlig, K., & Bergström, A. (1996). The acquisition of tense and aspect in
SLA and FLL: A study of learner narratives in English (SL) and French (FL).
Canadian Modern Language Review, 52, 308–330.
Bardovi-Harlig, K., & Reynolds, D. W. (1995). The role of lexical aspect in the
acquisition of tense and aspect. TESOL Quarterly, 29, 107–131.
Benati, A. (2005). The effects of processing instruction, traditional instruction and
meaning-output instruction on the acquisition of the English past simple tense.
Language Teaching Research, 9(1), 67–93.
Bitchener, J., Young, S., & Cameron, D. (2005). The effect of different types of
corrective feedback on ESL student writing. Journal of Second Language Writing,
14, 191–205.
Bouton, L. F. (1994). Can NNS skill in interpreting implicature in American English
be improved through explicit instruction? A pilot study. Pragmatics and Language
Learning, 5, 89–109.
Cadierno, T. (1995). Formal instruction from a processing perspective: An
investigation into the Spanish past tense. Modern Language Journal, 79,
179–193.
Carroll, S., Roberge, Y., & Swain, M. (1992). The role of feedback in adult second
language acquisition: Error correction and morphological generalization. Applied
Psycholinguistics, 13, 173–189.
Carroll, S., & Swain, M. (1993). Explicit and implicit negative feedback: An
empirical study of the learning of linguistic generalizations. Studies in Second
Language Acquisition, 15(3), 357–386.
Celce-Murcia, M., & Larsen-Freeman, D. (1999). The grammar book. Boston: Heinle
& Heinle.
Chan, A. Y. W. (2006). An algorithmic approach to error correction: An empirical
study. Foreign Language Annals, 39(1), 131–147.
Clachar, A. (2005). Creole English speakers' treatment of tense-aspect morphology in
English interlanguage written discourse. Language Learning, 55(2), 275–334.
Collins, L. (2002). The role of L1 influence and lexical aspect in the acquisition of
temporal morphology. Language Learning, 52(1), 43–94.
Day, E., & Shapson, S. (1991). Integrating formal and functional approaches to
language teaching in French immersion: An experimental study. Language
Learning, 41, 25–58.
de Graaff, R. (1997). The eXperanto experiment: Effects of explicit instruction on
second language acquisition. Studies in Second Language Acquisition, 19,
249–297.
DeKeyser, R. M. (1995). Learning second language grammar rules: An experiment
with a miniature linguistic system. Studies in Second Language Acquisition, 17,
379–410.
DeKeyser, R. M. (1997). Beyond explicit rule learning: Automatizing second
language morphosyntax. Studies in Second Language Acquisition, 19, 195–221.
DeKeyser, R. M. (1998). Beyond focus on form: Cognitive perspectives on learning
and practicing second language grammar. In C. Doughty & J. Williams (Eds.),
Focus on form in classroom second language acquisition (pp. 42–63). Cambridge:
Cambridge University Press.
DeKeyser, R. M. (2003). Implicit and explicit learning. In C. Doughty & M. Long
(Eds.), The handbook of second language acquisition (pp. 313–348). Oxford:
Blackwell.
DeKeyser, R. M. (2005). What makes learning second-language grammar difficult? A
review of issues. Language Learning, 55(Suppl. 1), 1–25.
DeKeyser, R. M., & Sokalski, K. J. (1996). The differential role of comprehension
and production practice. Language Learning, 46, 613–642.
Doughty, C. (1991). Second language instruction does make a difference: Evidence
from an empirical study of SL relativization. Studies in Second Language
Doughty, C. (2003). Instructed SLA: Constraints, compensation, and enhancement. In
C. Doughty & M. Long (Eds.), The handbook of second language acquisition
(pp. 256–310). Oxford: Blackwell.
Doughty, C., & Varela, E. (1998). Communicative focus on form. In C. Doughty & J.
Williams (Eds.), Focus on form in classroom second language acquisition (pp.
114–138). Cambridge: Cambridge University Press.
Doughty, C., & Williams, J. (1998). Pedagogical choices in focus on form. In
C. Doughty & J. Williams (Eds.), Focus on form in classroom second language
acquisition (pp. 197–261). Cambridge: Cambridge University Press.
Ellis, R. (1990). Instructed second language acquisition. Oxford: Basil Blackwell.
Ellis, R. (1993). The structural syllabus and second language acquisition. TESOL
Quarterly, 27(1), 91–113.
Ellis, R. (2001). Investigating form-focused instruction. Language Learning, 51, 1–46.
Ellis, R. (2002). Does form-focused instruction affect the acquisition of implicit
knowledge? A review of the research. Studies in Second Language Acquisition, 24,
223–236.
Ellis, R. (2005). Measuring implicit and explicit knowledge of a second language: A
psychometric study. Studies in Second Language Acquisition, 27, 141–172.
Ellis, R. (2006a). Current issues in the teaching of grammar: An SLA perspective.
TESOL Quarterly, 40(1), 83–107.
Ellis, R. (2006b). Modeling learning difficulty and second language proficiency: The
differential contributions of implicit and explicit knowledge. Applied Linguistics,
27(3), 431–463.
Ellis, R., Loewen, S., & Erlam, R. (2006). Implicit and explicit corrective feedback
and the acquisition of L2 grammar. Studies in Second Language Acquisition, 28,
339–368.
Ellis, R., Rosszell, H., & Takashima, H. (1994). Down the garden path: Another look
at negative feedback. JALT Journal, 16, 9–24.
Fotos, S., & Ellis, R. (1991). Communicating about grammar: A task-based
approach. TESOL Quarterly, 25(4), 605–628.
Goldschneider, J. M., & DeKeyser, R. M. (2005). Explaining the natural order of L2
morpheme acquisition in English: A meta-analysis of multiple determinants.
Language Learning, 55(Suppl. 1), 27–77.
Gor, K., & Chernigovskaya, T. (2005). Formal instruction and the acquisition of
verbal morphology. In A. Housen & M. Pierrard (Eds.), Investigations in instructed
second language acquisition (pp. 131–164). New York: Mouton de Gruyter.
Griggs, P. (2005). Assessment of the role of communication tasks in the development
of second language oral production skills. In A. Housen & M. Pierrard (Eds.),
Investigations in instructed second language acquisition (pp. 407–432). New York:
Mouton de Gruyter.
Han, Z. (2002). A study of the impact of recasts on tense consistency in L2 output.
Harley, B. (1993). Instructional strategies and SLA in early French immersion.
Studies in Second Language Acquisition, 15, 245–259.
Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and
related estimators. Journal of Educational Statistics, 6, 107–128.
Herron, C., & Tomasello, M. (1988). Learning grammatical structures in foreign
language: Modelling versus feedback. The French Review, 61, 910–922.
Hinkel, E. (1997). The past tense and temporal verb meanings in a contextual frame.
Housen, A. (2002). The development of tense-aspect in English as a second language
and the variable influence of inherent aspect. In R. Salaberry & Y. Shirai (Eds.),
The L2 acquisition of tense-aspect morphology (pp. 155–197). Amsterdam:
Benjamins.
Housen, A., Pierrard, M., & Van Daele, S. (2005). Rule complexity and the efficacy of
explicit grammar instruction. In A. Housen & M. Pierrard (Eds.), Investigations in
instructed second language acquisition (pp. 235–269). Amsterdam: Mouton de
Gruyter.
Hsu, L. M. (2002). Fail-safe Ns for one-versus two-tailed tests lead to different
conclusions about publication bias. Understanding Statistics, 1(2), 85–100.
Hulstijn, J. H. (1989). Implicit and incidental second language learning:
Experiments in the processing of natural and partly artificial input. In H. W.
Dechert & M. Raupach (Eds.), Interlingual processes (pp. 49–73). Tübingen:
Gunter Narr.
Hulstijn, J. H. (1995). Not all grammar rules are equal: Giving grammar instruction its
proper place in foreign language teaching. In R. Schmidt (Ed.), Attention and
awareness in foreign language learning (pp. 359–386). Honolulu: University of
Hawaii at Manoa.
Hulstijn, J. H., & de Graaff, R. (1994). Under what conditions does explicit knowledge
of a second language facilitate the acquisition of implicit knowledge? A research
proposal. AILA Review, 11, 97–112.
Ionin, T., & Wexler, K. (2003). Why is 'is' easier than '-s'? Acquisition of
tense/agreement morphology by child second language learners of English. Second
Language Research, 18(2), 95–136.
Izumi, S. (2002). Output, input enhancement, and the noticing hypothesis: An
experimental study on ESL relativization. Studies in Second Language Acquisition,
24, 541–577.
Izumi, S., & Bigelow, M. (2000). Does output promote noticing and second language
acquisition? TESOL Quarterly, 34(2), 239–278.
Izumi, S., Bigelow, M., Fujiwara, M., & Fearnow, S. (1999). Testing the output
hypothesis: Effects of output on noticing and second language acquisition. Studies
in Second Language Acquisition, 21, 421–452.
Izumi, Y., & Izumi, S. (2004). Investigating the effects of oral output on the learning
of relative clauses in English: Issues in the psycholinguistic requirements for
effective output tasks. Canadian Modern Language Review, 60(5), 587–609.
Izumi, S., & Lakshmanan, U. (1998). Learnability, negative evidence and the L2
acquisition of the English passive. Second Language Research, 14(1), 62–101.
Jourdenais, R., Ota, M., Stauffer, S., Boyson, B., & Doughty, C. (1995). Does textual
enhancement promote noticing? A think-aloud protocol analysis. In R. Schmidt
(Ed.), Attention and awareness in foreign language learning (Tech. Rep. No. 9;
pp. 182–216). Honolulu: University of Hawaii, Second Language Teaching &
Curriculum Center.
Keck, C. M., Iberri-Shea, G., Tracy-Ventura, N., & Wa-Mbaleka, S. (2006).
Investigating the empirical link between task-based interaction and acquisition: A
meta-analysis. In J. Norris & L. Ortega (Eds.), Synthesizing research on language
learning and teaching (pp. 91–131). Amsterdam: Benjamins.
Kihlstedt, M. (2002). Reference to past events in dialogue: The acquisition of tense
and aspect by advanced learners of English. In R. Salaberry & Y. Shirai (Eds.),
The L2 acquisition of tense-aspect morphology (pp. 323–361). Amsterdam:
Benjamins.
Krashen, S. (1982). Principles and practice in second language acquisition. Oxford:
Pergamon.
Krashen, S. (1994). The input hypothesis and its rivals. In N. Ellis (Ed.), Implicit and
explicit learning of languages (pp. 45–77). London: Academic Press.
Kubota, M. (1993). Accuracy order and frequency order of relative clauses as used by
Japanese senior high school students of EFL. Institute for Research in Language
Teaching Bulletin, 7, 27–53.
Kubota, M. (1994). The role of negative feedback on the acquisition of the English
dative alternation by Japanese college students of EFL. Institute for Research in
Language Teaching Bulletin, 8, 1–36.
Kubota, M. (1995a). The Garden Path technique: Is it really effective? Working
Papers of Chofu Gakuin Women's Junior College, 27, 21–48.
Kubota, M. (1995b). Teachability of conversational implicature to Japanese EFL
learners. Institute for Research in Language Teaching Bulletin, 9, 35–67.
Kubota, M. (1996). The effects of instruction plus feedback on Japanese university
students of EFL: A pilot study. Bulletin of Chofu Gakuen Women's Junior College,
18, 59–95.
Kuiken, F., & Vedder, I. (2002). The effect of interaction in acquiring the
grammar of a second language. International Journal of Educational Research, 37,
343–358.
Lantolf, J. P., DiCamilla, F. J., & Ahmed, M. K. (1997). The cognitive function of
linguistic performance: Tense/aspect use by L1 and L2 speakers. Language
Sciences, 19, 153–165.
Larsen-Freeman, D., Kuehn, T., & Haccius, M. (2002). Helping students make
appropriate English verb tense-aspect choices. TESOL Journal, 11(4), 3–9.
Leow, R. P. (1997). Attention, awareness, and foreign language behavior. Language
Learning, 47, 467–506.
Leow, R. P. (1998a). The effects of amount and type of exposure on adult learners' L2
development in SLA. Modern Language Journal, 82, 49–68.
Leow, R. P. (1998b). Toward operationalizing the process of attention in SLA:
Evidence for Tomlin and Villa's (1994) fine-grained analysis of attention. Applied
Psycholinguistics, 19, 133–159.
Lightbown, P. M. (1980). The acquisition and use of questions by French L2 learners.
In S. Felix (Ed.), Second language development: Trends and issues (pp. 151–175).
Tübingen: Gunter Narr.
Lipsey, M., & Wilson, D. (2001). Practical meta-analysis. London: Sage.
Long, M. H., Inagaki, S., & Ortega, L. (1998). The role of implicit negative feedback
in SLA: Models and recasts in Japanese and Spanish. Modern Language Journal,
82, 357–371.
Loschky, L. (1994). Comprehensible input and second language acquisition: What is
the relationship? Studies in Second Language Acquisition, 16, 303–323.
Lyster, R. (1994). The effect of functional-analytic teaching on aspects of French
immersion students' sociolinguistic competence. Applied Linguistics, 15, 263–287.
Mackey, A. (1999). Input, interaction, and second language development: An
empirical study on question formation in ESL. Studies in Second Language
Mackey, A. (2006). Feedback, noticing and instructed second language learning.
Applied Linguistics, 27(3), 405–430.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and
research synthesis. In A. Mackey (Ed.), Conversational interaction in second
language acquisition: A series of empirical studies (pp. 407–452). Oxford: Oxford
University Press.
Mackey, A., & Oliver, R. (2002). Interactional feedback and children's L2
development. System, 30, 459–477.
Mackey, A., & Philp, J. (1998). Conversational interaction and second language
development: Recasts, responses, and red herrings? Modern Language Journal,
82(3), 338–356.
Master, P. (1994). The effect of systematic instruction on learning the English article
system. In T. Odlin (Ed.), Perspectives on pedagogical grammar (pp. 229–252).
Cambridge: Cambridge University Press.
McDonough, K. (2004). Learner-learner interaction during pair and small group
activities in a Thai EFL context. System, 32, 207–224.
McDonough, K. (2005). Identifying the impact of negative feedback and learners'
responses on ESL question development. Studies in Second Language Acquisition,
27, 79–103.
McDonough, K. (2006). Interaction and syntactic priming: English L2 speakers'
production of dative constructions. Studies in Second Language Acquisition, 28,
179–207.
Meisel, J., Clahsen, H., & Pienemann, M. (1981). On determining developmental
stages in natural second language acquisition. Studies in Second Language
Muranoi, H. (2000). Focus on form through interaction enhancement: Integrating
formal instruction into a communicative task in EFL classrooms. Language
Learning, 50(4), 617–673.
Murphy, V. A. (2004). Dissociable systems in second language inflectional
morphology. Studies in Second Language Acquisition, 26(3), 433–459.
Nagata, N. (1993). Intelligent computer feedback for second language instruction.
Modern Language Journal, 77, 330–339.
Nagata, N. (1995). An effective application of natural language processing in second
language instruction. CALICO Journal, 13, 47–67.
Nagata, N. (1997a). The effectiveness of computer-assisted metalinguistic instruction:
A case study in Japanese. Foreign Language Annals, 30, 187–200.
Nagata, N. (1997b). An experimental comparison of deductive and inductive
feedback generated by a simple parser. System, 25, 515–534.
Nagata, N. (1998). Input vs. output practice in educational software for second
language acquisition. Language Learning & Technology, 1(2), 23–40.
Nakamori, T. (2002). Teaching relative clauses: How to handle a bitter lemon for
Japanese learners and English teachers. ELT Journal, 56(1), 29–40.
Norris, J., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis
and quantitative meta-analysis. Language Learning, 50, 417–528.
Norris, J., & Ortega, L. (2006). The value and practice of research synthesis for
language learning. In J. Norris & L. Ortega (Eds.), Synthesizing research on
language learning and teaching (pp. 3–52). Amsterdam: Benjamins.
Nunan, D. (1994). Linguistic theory and pedagogic practice. In T. Odlin (Ed.),
Perspectives on pedagogical grammar (pp. 253–270). Cambridge: Cambridge
University Press.
Orwin, R. G. (1983). A fail-safe N for effect size in meta-analysis. Journal of
Educational Statistics, 8(2), 157–159.
Pienemann, M. (1989). Is language teachable? Psycholinguistic experiments and
hypothesis. Applied Linguistics, 10, 217–244.
Ravem, R. (1973). Language acquisition in a second language environment. In J. Oller
& J. Richards (Eds.), Focus on the learner (pp. 136–144). Rowley, MA: Newbury
House.
Reber, A. S. (1989). Implicit learning and tacit knowledge. Journal of Experimental
Psychology: General, 118, 219–235.
Robinson, P. (1996). Learning simple and complex second language rules under
implicit, incidental, rule-search and instructed conditions. Studies in Second
Language Acquisition, 18, 27–67.
Robinson, P. (1997a). Generalizability and automaticity of second language learning
under implicit, incidental, enhanced and instructed conditions. Studies in Second
Language Acquisition, 19(2), 233–247.
Robinson, P. (1997b). Individual differences and the fundamental similarity of
implicit and explicit adult second language learning. Language Learning, 47,
45–99.
Rohde, A. (2002). The aspect hypothesis in naturalistic L2 acquisition: What
uninflected and non-target-like verb forms in early interlanguage tell us. In R.
Salaberry & Y. Shirai (Eds.), The L2 acquisition of tense-aspect morphology
(pp. 199–220). Amsterdam: Benjamins.
Rosenthal, R. (1979). The file-drawer problem and tolerance for null results.
Psychological Bulletin, 86(3), 638–641.
Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park,
CA: Sage.
Rosenthal, R. (1995). Writing meta-analytic reviews. Psychological Bulletin, 118(2),
183–192.
Rosenthal, R., Rosnow, R. L., & Rubin, D. B. (2000). Contrasts and effect sizes in
behavioral research: A correlational approach. Cambridge: Cambridge University
Press.
Salaberry, M. R. (1997). The role of input and output practice in second language
acquisition. Canadian Modern Language Review, 53, 422–451.
Salaberry, R. (2000). The acquisition of English in an instructional setting. System,
28, 135–152.
Salaberry, R., & Shirai, Y. (2002). L2 acquisition of tense-aspect morphology. In R.
Salaberry & Y. Shirai (Eds.), The L2 acquisition of tense-aspect morphology
Schmidt, R. (1990). The role of consciousness in second language learning. Applied
Linguistics, 11, 129–158.
Schmidt, R. (1994). Deconstructing consciousness in search of useful definitions for
applied linguistics. AILA Review, 11, 11–26.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 3–32). Cambridge: Cambridge University Press.
Schumann, J. (1979). The acquisition of English negation by speakers of Spanish: A
review of the literature. In R. Andersen (Ed.), The acquisition and use of Spanish
and English as first and second languages (pp. 3–32). Washington, DC: TESOL.
Schwartz, B. (1993). On explicit and negative data effecting and affecting competence
and linguistic behavior. Studies in Second Language Acquisition, 15, 147–162.
Scott, V. (1989). An empirical study of explicit and implicit teaching strategies in
French. Modern Language Journal, 72, 14–22.
Scott, V. (1990). Explicit and implicit grammar teaching: New empirical data. The
French Review, 62, 779–788.
Slabakova, R., & Montrul, S. (2002). On viewpoint aspect interpretation and its L2
acquisition: A UG perspective. In R. Salaberry & Y. Shirai (Eds.), The L2
acquisition of tense-aspect morphology (pp. 363–395). Amsterdam: Benjamins.
Spada, N. (1997). Form-focused instruction and second language acquisition: A review
of classroom and laboratory research. Language Teaching, 30, 73–87.
Spada, N., & Lightbown, P. M. (1993). Instruction and the development of questions
in the L2 classroom. Studies in Second Language Acquisition, 15, 205–221.
Spada, N., & Lightbown, P. M. (1999). Instruction, first language influence, and
developmental readiness in second language acquisition. Modern Language
Journal, 83(1), 1–22.
Spada, N., & Lightbown, P. M. (2008). Form-focused instruction: Isolated or
integrated? TESOL Quarterly, 42(2), 181–207.
Spada, N., Lightbown, P. M., & White, J. (2005). The importance of form/meaning
mappings in explicit form-focused instruction. In A. Housen & M. Pierrard (Eds.),
Current issues in instructed second language learning (pp. 199–234). Brussels:
Mouton de Gruyter.
Spada, N., & Tomita, Y. (2008). The complexities of selecting complex (and simple)
forms in instructed SLA research. In A. Housen & F. Kuiken (Eds.), Proceedings of
the complexity, accuracy and fluency (CAF) conference (pp. 227–254). Belgium:
University of Brussels.
Takashima, H., & Ellis, R. (1999). Output enhancement and the acquisition of the
past tense. In R. Ellis (Ed.), Learning a second language through interaction
Tickoo, A. (2002). On the use of then/after that in the marking of chronological
order: Insights from Vietnamese and Chinese learners of ESL. System, 30,
107–124.
Truscott, J. (1996). The case against grammar correction in L2 writing classes.
Language Learning, 46, 327–369.
Truscott, J. (1999). What's wrong with oral grammar correction. Canadian Modern
Language Review, 55, 437–456.
Vacha-Haase, T., & Thompson, B. (2004). How to estimate and interpret various effect
sizes. Journal of Counseling Psychology, 51(4), 473–481.
van Baalen, T. (1983). Giving learners rules: A study into the effect of grammatical
instruction with varying degrees of explicitness. Interlanguage Studies Bulletin
Utrecht, 7, 71–100.
VanPatten, B., & Cadierno, T. (1993). Explicit instruction and input processing.
Studies in Second Language Acquisition, 15, 225–241.
VanPatten, B., & Oikkenon, S. (1996). Explanation versus structured input in
processing instruction. Studies in Second Language Acquisition, 18, 495–510.
VanPatten, B., & Sanz, C. (1995). From input to output: Processing instruction and
communicative tasks. In F. Eckman, D. Highland, P. Lee, J. Mileham, & R. Weber
(Eds.), SLA theory and pedagogy (pp. 169–185). Hillsdale, NJ: Erlbaum.
White, J. (1991). Adverb placement in second language acquisition: Some effects of
positive and negative evidence in the classroom. Second Language Research, 7,
133–161.
White, J., & Ranta, L. (2002). Examining the interface between metalinguistic task
performance and oral production in a second language. Language Awareness, 11(4),
259–290.
White, L., Spada, N., Lightbown, P. M., & Ranta, L. (1991). Input enhancement and
L2 question formation. Applied Linguistics, 12, 416–432.
Williams, J., & Evans, J. (1998). What kind of focus and on which forms? In C.
Doughty & J. Williams (Eds.), Focus on form in classroom second language
acquisition (pp. 139–155). Cambridge: Cambridge University Press.
Wode, H. (1976). Developmental sequences in naturalistic L2 acquisition. Working
Papers in Bilingualism, 11, 1–13.
Yang, L., & Givón, T. (1997). Benefits and drawbacks of controlled laboratory studies
of second language acquisition. Studies in Second Language Acquisition, 19,
173–194.
Yang, S., & Huang, Y. (2004). The impact of the absence of grammatical tense in L1
on the acquisition of the tense-aspect system in L2. International Review of Applied
Linguistics in Language Teaching, 42(1), 49–70.
Yip, Y. (1994). Grammatical consciousness-raising and learnability. In T. Odlin
(Ed.), Perspectives on pedagogical grammar (pp. 123–139). Cambridge: Cambridge
University Press.
Zobl, H. (1985). Grammars in search of input and intake. In S. Gass & C. Madden
(Eds.), Input in SLA (pp. 329–344). Rowley, MA: Newbury House.


Appendix A: Summary of Synthesized Studies

Study: Spada et al. (2005); Master (1994); Muranoi (2000); Chan (2006); Ellis et al. (2006); Mackey (2006); Benati (2005); Robinson (1997a); Han (2002); Izumi (2002); Kubota (1994); Mackey (1999); McDonough (2005); Robinson (1996, 1997b); Yip (1994); Doughty (1991); White et al. (1991)

N     Context     Target^a              Instruction^b
90    Classroom   Questions (c)         E/E/C
90    Classroom   PDs (s)               E/E/C
61    Classroom   Past tense (s)        I/C
59    Classroom   PDs (s)               E/C
53    Classroom   Prep (s)              E/E/C
53    Classroom   Tense (s)             E/E/C
53    Classroom   Articles (s)          E/E/C
160   Classroom   Relative (c)          E
34    Classroom   Past tense (s)        I/E/C
28    Classroom   Plurals (s)           I
28    Classroom   Past tense (s)        I
28    Classroom   Questions (c)         I
47    Classroom   Articles (s)          E/C
91    Classroom   Def. Articles (s)     I/E/C
91    Classroom   Inf. Articles (s)     I/E/C
144   Classroom   Questions (c)         I
34    Classroom   Relativization (c)    E/E/E/C
47    Classroom   Past tense (s)        E/E/C(B)
30    Classroom   Past tense (s)        E/E/C(B)
129   Classroom   Questions (c)         E/C
108   Classroom   Questions (c)         E/C
33    Classroom   Passives (c)          I/E/C
33    Classroom   Participial adj (s)   I/E/C
10    Classroom   Passives (c)          E
100   Laboratory  Datives (c)           I/E/E/C
20    Laboratory  Relativization (c)    I/E/C(B)
56    Laboratory  Datives (c)           E/E/C
34    Laboratory  Datives (c)           E/E/C
8     Laboratory  Tense (s)             I/C
61    Laboratory  Relativization (c)    I/I/I/I/C
24    Laboratory  Relativization (c)    I/I/C
15    Laboratory  Passives (c)          E/C
100   Laboratory  Dative (c)            E/E/I/I/C
34    Laboratory  Passives (c)          I/C
34    Laboratory  Questions (c)         I/I/I/I/C
22    Laboratory  Questions (c)         I/C
35    Laboratory  Questions (c)         I/I/I/I/C
60    Laboratory  Questions (c)         I/I/I/C
104   Laboratory  Pseudo-cleft (c)      I/E/E/C(B)
104   Laboratory  S-V inversion (s)     I/E/E/C(B)
60    Laboratory  Datives (c)           I/E/E/C(B)

a Complex and simple forms are represented as (c) and (s), respectively.
b Implicit, explicit, control, and comparison (baseline) groups are represented as I, E, C, and C(B), respectively.



Appendix B: Length of Treatment^a and Timing of Posttests

Study: Spada et al. (2005); Master (1994); Muranoi (2000); Chan (2006); Ellis et al. (2006); Mackey (2006); Benati (2005)
Length^b (hr): 6; 6; 3 lessons; 3; 1; 2.5; 2.5; 2.5; 6; 1.5; 1.5; 8; 0.33; 0.33; 0.33; 1.5; 6; 6
Immediate (day): In a week; In a week; 7; 7; 4 weeks; 4 weeks; 4 weeks; In a week; 1; In a week; In a week; In a week; In a week; In a week; In a week; 3; 1; 0; 0
First delayed (week): 5; 5; 16; 4; 2

Appendix B (Continued)

Study: Robinson (1997a); Han (2002); Izumi (2002); Kubota (1994); Mackey (1999); McDonough (2005); Robinson (1996, 1997b); Yip (1994); Doughty (1991); White et al. (1991)
Length^b (hr): 5; 9; 8; 8; 0.75; 2 lessons; 1.67; 0.5; 0.5; 8 lessons; 4.5; 1.25; 3; 3 lessons; 1.5; 1; 1.5; 1; 0.5; 0.17; 0.17; 0.42
Immediate (day): On the first day; On the first day; 2 weeks; 2 weeks; 14; 0; 0; 0; 0; 5; 3.43; 0; 5; 0; 0; 1; 1; 1; In a week; 0; 0; 0
First delayed (week): 1; 1; 4; 8; 4; 2; 2; 2; 4
Second delayed (week): 3; 3; 7

a Some studies reported the length by the number of lessons. We considered each lesson to be 1 hr.
b The mean treatment lengths were 2.97 hr for explicit-complex, 3.09 hr for explicit-simple, 2.47 hr for implicit-complex, and 3.13 hr for
implicit-simple groups.
Appendix C: Learner and Study Characteristics

                         Explicit-complex   Explicit-simple   Implicit-complex   Implicit-simple
                         n^a     k^b        n      k          n      k           n      k
Proficiency
  Low                    3       4          0      0          1      1           0      0
  Mid                    5       10         5      9          6      11          5      5
  High                   1       1          0      0          0      0           0      0
  Mix                    1       1          1      1          2      8           0      0
  1–2 years              1       2          1      2          0      0           0      0
  5 years and more       2       4          0      0          4      7           1      1
  Not reported           2       2          6      8          2      2           3      3
Educational context
  College                5       9          3      3          3      6           3      3
  Intensive ESL          4       4          1      1          5      9           4      4
  Adult, not college     2       4          5      9          4      11          2      2
  Secondary              2       4          2      4          1      1           0      0
  Elementary             3       4          2      3          1      1           0      0
  Child ESL              0       0          0      0          1      1           0      0
ESL/EFL^c
  ESL                    10      13         9      14         11     22          6      6
  EFL                    5       11         4      6          4      7           3      3
Study design
  Control                10      17         10     14         10     24          6      6
  Comparison             3       5          3      6          3      3           1      1
  No control/comparison  2       2          0      0          2      2           2      2
Study context
  Class                  7       10         12     18         3      3           7      7
  Lab                    8       14         1      2          12     26          2      2

a Number of sample studies (e.g., a study report may include multiple sample studies)
contributing to the meta-analysis.
c More than 20 first languages are reported in the original publications.
Appendix D: Effect Size Calculations

Cohens d
1. If the original study provided means and standard deviations, or if the original study provided sample sizes and all participants raw scores, Cohens
306
Spada and Tomita
d was calculated as in (a), where M stands for the mean, SQRT refers to the
squared root, and SD is the standard deviation (adapted from Vacha-Haase
& Thompson, 2004, p. 474):
d = {M1 M2 }/{SQRT[(SD1 SD1 + SD2 SD2 )/2]}.
(a)
2. If the original study reported F values from a one-way ANOVA and sample
sizes, Cohens d was computed as in (b), where n is a sample size (adapted
from Keck et al., 2006, p.106):
d = SQRT{F(n 1 + n 2 )/(n 1 n 2 )}.
(b)
3. If the original study provided t values from a t test and sample sizes, Cohen's d
was computed as in (c) (adapted from Rosenthal, 1991, pp. 17–18):
d = 2t/[SQRT(df)].    (c)
The assumption in (c) is equal sample sizes in the two compared groups.
There is a different calculation for computing effect sizes by comparing two
groups with different sample sizes (see Rosenthal, 1991, for a discussion).
In this meta-analysis, we used (c) to calculate the effect sizes for only
one study. Although the sample sizes of the two compared groups were
not always the same, the difference in the sample sizes was small
(n2 − n1 = 2).
4. If the original study reported only the percentage of participants who
experienced improvement, Cohen's d was calculated based on arcsine
transformations as in (d) (adapted from Keck et al., 2006, p. 106; Lipsey & Wilson,
2001, pp. 188, 204):
d = arcsine_treatment − arcsine_control.    (d)
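The four conversions in (a) through (d) can be illustrated with a minimal Python sketch. The function names are hypothetical and are not taken from the original study. Formula (d) is written here on the assumption that each proportion p is transformed as 2 × arcsin(SQRT(p)), the usual form of the arcsine transformation; the appendix itself does not spell this out.

```python
import math


def d_from_means(m1, m2, sd1, sd2):
    """Formula (a): mean difference divided by the root mean square of the two SDs."""
    return (m1 - m2) / math.sqrt((sd1 * sd1 + sd2 * sd2) / 2)


def d_from_f(f, n1, n2):
    """Formula (b): d from a one-way ANOVA F value and the two sample sizes.

    The square root returns only the magnitude; the direction of the
    effect has to be taken from the group means.
    """
    return math.sqrt(f * (n1 + n2) / (n1 * n2))


def d_from_t(t, df):
    """Formula (c): d from a t value, assuming equal sample sizes in both groups."""
    return 2 * t / math.sqrt(df)


def d_from_proportions(p_treatment, p_control):
    """Formula (d): difference between arcsine-transformed proportions.

    Assumption: the transformation is 2 * arcsin(sqrt(p)).
    """
    def arcsine(p):
        return 2 * math.asin(math.sqrt(p))

    return arcsine(p_treatment) - arcsine(p_control)
```

With equal standard deviations, (a) reduces to the mean difference divided by that common standard deviation, which is a convenient check when reading values off a published table.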
Unbiased and Weighted Effect Size Calculation

Hedges (1981) provided a formula for calculating effect sizes as in (e), where
d′ is an unbiased effect size, N is the total sample size, and d is the biased or
raw effect size (adapted from Hedges, 1981, p. 114; Lipsey & Wilson, 2001,
p. 49):
d′ = [1 − 3/(4N − 9)] d.    (e)
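As a rough illustration of the correction in (e), the sketch below applies the factor 1 − 3/(4N − 9) to a raw effect size. The function name and the example values are hypothetical.

```python
def unbiased_d(d, n_total):
    """Formula (e): small-sample correction of a raw effect size d.

    n_total is the total sample size N across both groups.
    """
    return (1 - 3 / (4 * n_total - 9)) * d


# Hypothetical example: a raw d of 1.0 from a study with 21 participants in total.
corrected = unbiased_d(1.0, 21)
```

The correction factor is always below 1 and approaches 1 as N grows, so it matters mainly for the small samples that are common in instructed SLA studies.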
In order to compute the weighted effect size, we calculated the inverse variance
weight as in (f) (adapted from Lipsey & Wilson, 2001, p. 49):
w = 2 n1 n2 (n1 + n2)/[2(n1 + n2)² + n1 n2 d²],    (f)
where w is the inverse variance weight. Finally, the weighted mean effect size,
the standard error of the mean effect size (SE), and lower and upper 95%
confidence intervals (CI) were calculated as in (g), (h), and (i), respectively
(adapted from Lipsey & Wilson, 2001):
Weighted mean effect size d = Σ(wd)/Σw,    (g)
SE = SQRT(1/Σw),    (h)
CI = weighted mean effect size d ± 1.96SE.    (i)
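The weighting steps in (f) through (i) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' own procedure: the function names are hypothetical, and each study is represented as a tuple of effect size and the two group sizes.

```python
import math


def inverse_variance_weight(d, n1, n2):
    """Formula (f): inverse variance weight for one effect size."""
    return 2 * n1 * n2 * (n1 + n2) / (2 * (n1 + n2) ** 2 + n1 * n2 * d ** 2)


def weighted_summary(effects):
    """Formulas (g), (h), and (i) for a list of (d, n1, n2) tuples.

    Returns the weighted mean effect size, its standard error, and the
    lower and upper bounds of the 95% confidence interval.
    """
    weights = [inverse_variance_weight(d, n1, n2) for d, n1, n2 in effects]
    total_w = sum(weights)
    mean_d = sum(w * d for w, (d, _n1, _n2) in zip(weights, effects)) / total_w
    se = math.sqrt(1 / total_w)
    return mean_d, se, mean_d - 1.96 * se, mean_d + 1.96 * se


# Hypothetical input: two effect sizes, each from groups of 10 learners.
mean_d, se, ci_low, ci_high = weighted_summary([(0.2, 10, 10), (0.8, 10, 10)])
```

Because the weight in (f) shrinks as d² grows, two studies with identical sample sizes do not contribute equally: the smaller effect size receives slightly more weight, so the weighted mean here falls just below the simple average of 0.5.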
Language Learning 60:2, June 2010, pp. 263–308
