discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/311965706
All content following this page was uploaded by Sanako Mitsugi on 14 February 2018.
SANAKO MITSUGI
University of Kansas
* This work was supported by the University of Kansas General Research Fund allocation
2302332-099 to Sanako Mitsugi. I thank Philip Kroh and Leslie Montes for experimental
preparation and data collection. I would like to thank Alison Gabriele, Theres Grüter, Brian
MacWhinney, Utako Minai, and the three anonymous reviewers for their valuable comments on
earlier versions of this paper. Any errors are, of course, my own.
Psycholinguistic research has shown that sentence processing is incremental (e.g., Altmann &
Kamide, 1999). In Japanese, a verb-final language, native speakers use case markers to anticipate
upcoming linguistic items. This study examined whether second-language learners of Japanese,
guided by case markers, generate predictions as to whether the upcoming verb involves the
active or passive voice. The results show that the native speakers made predictive eye movements
before the verb, but the learners did not; the learners were less efficient in using case-marker
cues than the native speakers and relied more on verb morphology information. These results
suggest that case markers guide thematic role assignments, expediting the processing for
Japanese native speakers. Learners may depend more on information from the verb to
In soccer games, a goalkeeper stops shots successfully because he is able to estimate the ball’s
trajectory and a future point at which to intercept it. Similarly, in language comprehension, we
can often guess what comes next in a sentence prior to the actual language input (Kimball, 1975),
and this ability to generate a prediction on how a sentence will continue makes our
research on adults provides substantial evidence for the prevalence of predictive processing
(DeLong, Urbach & Kutas, 2005; Federmeier, 2007; Lau, Stroud, Plesch & Phillips, 2006).
To the degree that second-language (L2) learners also engage in real-time comprehension,
the same mechanism may be operative. Accordingly, generating predictions should also result in
efficient and reliable processing of the L2. However, findings are mixed as to how
well L2 learners are able to generate predictions (Dowens, Vergara, Barber & Carreiras, 2010;
Dussias, Valdés Kroff, Guzzardo Tamargo & Gerfen, 2013; Foote, 2010; Grüter, Lew-Williams
& Fernald, 2012; Hopp, 2013; Keating, 2009; Martin et al., 2013). As Kaan (2014) pointed out,
the conditions under which L2 learners are or are not successful in achieving predictive
processing remain to be determined. In addressing this issue, the present study considered
incremental processing in L2 Japanese. The Japanese language provides a strong test case for
studies exploring how L2 learners engage in processing is small, albeit growing (Mitsugi &
In Japanese, all arguments appear before the verb. Nevertheless, Japanese comprehenders
process linguistic input incrementally without a delay before encountering the verb.
Postpositional case markers, in principle, serve as cues for thematic role assignments and, as
such, guide incremental processing for native speakers (Aoshima, Phillips & Weinberg, 2004;
Kamide, Altmann & Haywood, 2003; Miyamoto, 2002; Yamashita, 1997). However, Japanese
case markers are known to be difficult for L2 learners to master. Studies have shown that L2
that case markers present (Iwasaki, 2008; Koda, 1993; Sasaki, 1991, 1994). This deficit could
impede the L2 learners’ use of case markers during real-time comprehension (Mitsugi &
MacWhinney, 2010).
The present study pursued this line of research by examining the use of case-marker cues
to incrementally assign thematic roles and predictively activate the representation for which verb
issue addressed in this study. Japanese case markers do not always provide definite information
for role assignments until the sentence-final verb arrives. Passive structure—the structure under
investigation in this study—is a case in point. The canonical order of Japanese passives is a
sequence of a noun in the nominative case followed by a noun in the dative case, such as that
shown in example (1). At the time when Mary-ga is heard, it is initially assigned an agent role,
but when the predicate tatak-are-ta is processed, the thematic role must be reanalyzed from the
agent role to a patient role, so that the sentence is interpreted as having passive voice. In an
active sentence, such as example (2), on the other hand, an initially assigned agent role on John-
(1) Passive1
(2) Active
Because sentence processing is probabilistic (Gibson & Pearlmutter, 1998; Hale, 2001;
Jurafsky, 1996; Levy, 2008), the processor can use accruing linguistic and nonlinguistic
information to generate predictions about the outcome of thematic role assignments, even before
encountering the predicate tatak-are-ta in (1). Such predictions will be more refined when the
comprehenders are also provided with visuocontextual evidence. Even though the interpretation
of the sentential subject provides limited information regarding the upcoming noun and predicate,
as soon as the second noun is heard, the likely thematic relationships among the referents are
more constrained. With a visual scene depicting an act of hitting, a sequence of two nouns,
marked –ga (nominative) and –ni (dative), signals that an upcoming verb, tataku, will involve
In order to assess the time course of these processes, the study used the visual-world eye-
tracking paradigm (Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995). Because eye-
tracking measures allow us to time lock each eye movement to a corresponding segment in the
auditory input, they provide time-sensitive information about how different pieces of linguistic
information are integrated over time (Tanenhaus, 2007). The results presented here indicate that
L2 speakers were less efficient in using the case-marking cues that native speakers use and that
the learners used verb morphology information in a nativelike manner to compensate for the
Adult psycholinguistic research has shown that native speakers not only integrate each word into
a syntactic structure incrementally but also activate structural representations before they appear
in the linguistic input (e.g., Altmann & Kamide, 1999). This predictive behavior has been shown
to be an integral component of language processing that makes comprehension fast and efficient
(Federmeier, 2007; Lau et al., 2006; Staub & Clifton, 2006). A critical issue in L2 processing
research is whether the same processing strategies that have been identified in native speakers of
a target language are also found in L2 learners’ processing (Grüter, Rohde & Schafer, 2014,
One of the areas that have been extensively studied in L2 processing research is gender
agreement during lexical and syntactic processing. In languages such as French and Spanish,
grammatical gender is marked on the determiner, which precedes the noun, and the determiner
serves as a predictive cue for the upcoming noun. For instance, Dahan, Swingley, Tanenhaus,
and Magnuson (2000) showed that, in the absence of grammatical gender, French native
speakers looked at pictures with names that shared an initial sound with the target (e.g., vase
‘vase’ and vache ‘cow’) more than at pictures with phonologically unrelated names,
demonstrating cohort effects. However, such effects disappeared when the nouns were preceded
less efficient in using gender-marked determiners for generating predictions (Guillelmon &
Rohde, and Schafer (2014) proposed that learners have a reduced ability to generate expectations,
which they call the RAGE hypothesis (see also Grüter et al., 2016). The authors claim that the
which does not have enough resources for updating predictions after dealing with the
immediately required processes. Exceptions to this pattern have been observed, such that L2
learners demonstrated nativelike predictive processing (Dowens et al., 2010; Dussias et al., 2013;
Foote, 2010; Hopp, 2013; Keating, 2009). To resolve the discrepancy, Kaan (2014) proposed an
individual difference account, in which the potential differences between native and nonnative
predictive processing lie in the same factors that drive individual differences in native speakers;
in particular, Kaan considered factors such as differences in stored frequency information and the
accuracy and consistency of lexical representation. Frequency information bias moderates the
outcomes of predictions, because predictions are generated on the basis of the likelihood of a
certain word and syntactic frame occurring in the context (MacDonald, Pearlmutter &
Seidenberg, 1994).
The effect of frequency information can help us account for the positive evidence offered by
some studies that demonstrate nativelike predictive processing only in advanced L2 learners.
Typical L2 learners receive language input that differs from that of native speakers
in its nature and quantity. The logic is that some L2 learners nonetheless accumulate the relevant frequency
information, and therefore, their manner of predictive processing approximates that of native
speakers. One convincing case comes from studies on verb subcategorization bias (Dussias &
Cramer Scaltz, 2008; Dussias, Marful, Gerfen & Molina, 2010; Frenck-Mestre & Pynte, 1997).
Lee, Lu, and Garnsey (2013) examined how learners of English with an L1 of Korean used verb
information when they read sentences such as (3) and (4). These sentences were presented with
(3) The ticket agent admitted (that) the mistake might be hard to correct.
(4) The club members understood (that) the bylaws would be applied to everyone.
It was hypothesized that L2 learners experience garden-path effects at the auxiliary would in (4)
with understood, but there should not be garden-path effects at might in (3) with admit. This is
because the verb admit has a strong bias for a sentential complement, whereas the verb
understand takes a direct object more frequently than a sentential complement, which leads
speakers to anticipate the direct-object frame (Gahl, Jurafsky & Roland, 2004; Garnsey,
Pearlmutter, Myers & Lotocky, 1997). Lee and colleagues (2013) found that when reading (4),
both native speakers and L2 learners with advanced proficiency demonstrated a slowdown in
reading time at the auxiliary, regardless of the presence of that. However, lower-proficiency
learners processed the auxiliary faster for verbs with a sentential-complement bias than for
optionally transitive verbs, but only in the presence of that. This pattern of results confirms that
the L2 participants used the likelihood of a verb occurring with a direct object or with a sentence
additional cue—that is, the presence of that—to use verb frequency information for real-time
words and weaker associative connections (Chen, 1990; Dufour & Kroll, 1993), and therefore,
English. Instead, case markers are known to guide incremental processing, allowing native
German (Bader & Meng, 1999; Fanselow, Kliegl & Schlesewsky, 1999). This difference in the
types of cues used in predictive processing naturally leads us to question whether L2 learners are
able to use case marker cues for processing. Evidence from L2 processing studies in German and
Dutch demonstrated that learners may have difficulty integrating case and gender assignment
information (Havik, Roberts, Van Hout, Schreuder & Haverkort, 2009; Hopp, 2006; Jackson,
learners do not demonstrate a preference for a subject-first order to the same extent as native
speakers do; the subject preference was only observed for simple past-tense sentences, such as
(5), in which the verb is present early in the sentence, and not in those such as (7). However,
German native speakers exhibited a processing preference for subject-first sentences, regardless
Nachmittag im Café?
Nachmittag im Café?
Nachmittag getroffen?
afternoon met?
Subsequent studies have shown that highly advanced L2 learners can demonstrate nativelike
strategies when they process sentences in a larger discourse context (Hopp, 2009) or sentences in
which the initial wh-word is unambiguously marked with wer ‘who’ or wen ‘whom’ (Jackson &
Dussias, 2009). These results collectively suggest that highly proficient L2 learners may be able
to acquire nativelike predictive processing in head-final languages too, but such processing
strategies may be susceptible to a degree of ambiguity in the predictive cues and to the
The Japanese case system does not have dichotomous rules—as in the assignment of
either nominative or accusative case to Welche in (5). The Japanese nominative case, –ga, may
correspond—before the verbal disambiguation—to the agent, the logical object (a theme or
patient), or the logical object of a predicate adjective. Japanese therefore presents even greater
ambiguity of case-marker cues than Germanic languages do. Despite this ambiguity, previous
psycholinguistic research has shown that Japanese native speakers begin projecting the structure
of an upcoming sentence prior to the verb and that case markers are one of the major
determinants of that process (Aoshima et al., 2004; Kamide et al., 2003; Miyamoto, 2002;
Nakano, Felser & Clahsen, 2002; Yamashita, 1997). Adapting to case-marker-driven processing
may be a challenging task for L2 learners. The Japanese case system is known to be
difficult for L2 learners to master (Iwasaki, 2008). Studies within the competition model have
shown that L2 learners of Japanese demonstrate a reliance on word-order cues, failing to employ
information from case markers for comprehension (Kilborn & Ito, 1989; Rounds & Kanagy,
1998; Sasaki, 1991, 1994; see Sasaki & MacWhinney, 2006, for a review).
Mitsugi and MacWhinney (2016) replicated Experiment 3 of Kamide, Altmann, and Haywood (2003) with L2
learners whose L1 was English. In line with the results obtained in the original study (Kamide et al.,
2003), Mitsugi and MacWhinney (2016) demonstrated that Japanese native speakers showed
anticipatory eye movements, looking at the theme object in the visual scene after hearing the
marked noun and an accusative-marked noun signals no theme object to follow—as in (9)—
before the point at which the theme was mentioned in the input.
On the other hand, the L2 learners did not generate predictions. L2 learners regulated their
attention to the theme object more in the ditransitive condition than in the monotransitive
condition only after the target words were mentioned. Mitsugi and MacWhinney (2016)
accounted for the absence of predictions with the complexity of cues. That is, the process
involved anticipation of the thematic role and lexical semantics based on the combined
information of two preverbal nouns, which is more complicated than assigning a thematic role to
languages. Clearly, this interpretation remains speculative, and further testing of different
MacWhinney (2016), this study examined whether case markers are efficiently used by L2
learners of Japanese by looking at a different type of linguistic outcome prediction: active and
passive voice. In the next section, we lay out the key features of the structure in question.
Japanese actives and passives have been considered a binary opposition in grammatical voice
(Shibatani, 1988). The shift to the passive voice from the active is expressed as a part of the
inflection on the verb, which appears in sentence-final position. Examples (10) and (11) illustrate
an active sentence and its passive counterpart, respectively. Generative approaches treat
passive formation as a syntactic derivation from the active counterpart (Chomsky,
1995). The derivation of a passive sentence from an active sentence proceeds as
follows: The direct object noun, which is originally marked with the accusative case –o, is
promoted to the sentence-initial subject position. It is now marked with the nominative case affix
–ga in the passive sentence shown in (11). The initial subject, in turn, is demoted, and it is
marked with the dative case marker –ni. Finally, the passive morpheme –(r)are attaches to the
(10) Active
(11) Passive
In terms of the acquisition, it has been reported that Japanese-speaking children have
around age five or six (Sano, Endo & Yamakoshi, 2001). Possible sources of this difficulty have
been explored; some researchers have argued that the structural complexities of passives (i.e., the
A-chain maturation hypothesis; Borer & Wexler, 1987) contribute to this comprehension
difficulty (Minai, 2000; Sugisaki, 1999). Others posit that the difficulty resides in the assignment
of thematic roles in an atypical order (i.e., patient before agent). Hakuta (1982) showed that
Japanese children misinterpret a first noun phrase of the sentence as the thematic agent when it is
marked in the nominative case, drawing a parallel with English-speaking children, who use the
canonical-order strategy (Slobin & Bever, 1982). Similarly, Clancy (1985) demonstrated that
Japanese children acquired scrambled object–subject–verb passives earlier than they did subject–
object–verb passives. In recent eye-tracking studies, Choi and Trueswell (2010) attributed the
children’s disproportionate reliance on cues to the order in which those cues become available
(see Huang, Zheng, Meng & Snedeker, 2013, for a study on children processing Chinese
passives). Like adults, children were able to incrementally build structural representations as
soon as they had access to the relevant evidence, such as case markers in Korean (which has a
case system similar to that of Japanese). However, when this initial structural commitment
turned out to be incorrect, the children did not appear to revise their initial interpretation and
studies on Japanese passives have been rather descriptive in nature, focused on error analysis and
narrative analysis (Mizutani, 1985; Otsuka, 1989; Sato, 1997). These studies showed that the
passive construction poses a great challenge for L2 learners of Japanese too, which is indexed by
the delayed and scarce use of the structure by those speakers. For instance, Mizutani (1985)
pointed out that learners of Japanese with L1 English had a tendency to use the active voice in
contexts in which the passive voice would be more appropriate. Similarly, examining oral
narrative samples, Watanabe (1996) found evidence that learners did not effectively handle
perspectives, frequently and inconsistently changing the point of view in successive clauses,
whereas native speakers told a story consistently from the main character’s viewpoint. On the
basis of such production patterns, Tanaka (1998) argued that difficulties with the use of Japanese
passives stem from the relationship between structures and perspectives, in which Japanese
speakers are required to use different syntactic structures, such as actives, direct passives,
indirect passives, and benefactives, depending on the point of view in compound sentences.
Although these L2 studies can inform us about the pattern of learners’ errors and their
relationship with surrounding structures, they do not address how, exactly, L2 learners
process Japanese passives in real time. With this study, we take steps toward closing this gap.
The present study examined the extent to which L2 learners incrementally assign thematic roles
to preverbal nouns and predictively activate a structural representation of the upcoming verb
voice before encountering the verb. This study further considered the process in which verb
active and passive sentences, as in examples (13) and (14), were compared. We used the visual-
world paradigm, with the two-choice picture-identification method, in which one picture
matched the active reading, as in (13), and the other matched the passive reading, as in (14).
The first research question posed in the current study is whether L2 learners of Japanese
incrementally assign thematic roles when hearing preverbal nouns. Active sentences involve a
preverbal sequence of a noun in the nominative case and a noun in the accusative case, as in (13),
and passive sentences have a sequence of a noun in the nominative case and a noun in the dative
case, as in (14). We hypothesized that if the participants effectively use the information provided
by these case-marked nouns, they should predict a verb with active voice after hearing a
sequence of a noun in the nominative case and a noun in the accusative case, and the participants
should regulate their anticipatory eye movements toward the image depicting the active
reading—the left image in Figure 1. After hearing a sequence of a noun in the nominative case
and a noun in the dative case, they should anticipate that the upcoming verb would involve
passive morphology and should regulate their anticipatory eye movements toward the image
depicting the passive reading—the right image in Figure 1. Notice that the visual scene provides
the meaning of the verb, and therefore, the participants should not anticipate the verb turning out
to be one of the rare monotransitive verbs that take a noun marked in the dative case as its
theme (e.g., au ‘meet’) or to be a construction in which the dative noun is interpreted as a goal
after a ditransitive verb (e.g., watasu ‘pass’).3 If, however, L2 learners do not incrementally
assign thematic roles, they will initially extract identical information from the preverbal nouns.
Consequently, there should be no difference in their attention to the pictures in the active and
passive conditions.
The second question is how the information from the disambiguating verb feeds into
comprehension. Evidence from previous psycholinguistic studies with native Japanese speakers
suggests that when the initial assignment of thematic roles turns out to be wrong at a later point,
native speakers revise the interpretation (Hirotani, Makuuchi, Rüschemeyer & Friederici, 2011;
Mazuka & Itoh, 1995). In the passive condition, L2 learners’ eye movements to the picture
depicting the passive action are expected to increase over the course of processing the disambiguating
verb when they correctly predict verb voice. However, if the participants analyze the preverbal
fragment as a part of an active sentence in the passive condition, upon arriving at the passive
morphology, -rare, their initial assignment will need revision for the sentence to be correctly
Participants
background took part in this study. They all received US$10 for their participation. The L2-
learner participants were undergraduate students at a North American university, and the native-
speaker participants were also recruited from this university’s community. The learners were
enrolled in fourth- to eighth-semester Japanese courses at the time of data collection. The L2
learners’ exposure to the Japanese language was primarily through formal classroom instruction,
but several L2 participants had studied abroad. Participants’ biographical information was
assessed by a language learning history questionnaire, which was administered after the eye-
tracking task.
Because the L2 learners were recruited from various course levels, it was necessary to
assess their proficiency levels and to take them into account when examining the learners’
processing. We used Part I of the Japanese Skill Test (Itomitsu, 1996). This test is a
criterion-referenced test and is reported to be effective and reliable in determining Japanese learners’
proficiency (Eda, Itomitsu & Noda, 2008). The test consists of 45 multiple-choice items
and sentence structure rules. The L2 participants took this proficiency test on a different day,
after completing the eye-tracking session. Table 1 provides the biographical information and the
proficiency test results obtained from the 20 L2 participants whose data points were included in
The L2 learners also completed a translation task in which they translated Japanese
sentences into English. The translation task was administered after the eye-tracking task so as not
to influence the results of the main part of the study. This task was used to make sure that the L2
learners had the relevant grammatical knowledge. The task covered but was not limited to active
and passive structures; it also included sentences with causatives, causative-passives and relative
clauses, all of which required the participants to pay attention to thematic role assignments.
There were 16 items that directly tested the translation of active and passive structures. A
dichotomous scoring method was used, giving one point for each correct thematic role pair (i.e.,
agent–patient) and verb voice. The results of the translation task showed that the L2 learners did
have the relevant grammatical knowledge (mean [M] = 31.2 points, 97%, standard deviation [SD]
= 0.98). The cutoff point for the translation task was set at 91%, the
point two SDs below the mean. All of the participants were carried forward for
analysis.4
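The percentages reported for the translation task can be checked with a short sketch. It assumes a maximum of 32 points (16 items, with one point each for the thematic role pair and the verb voice) and that the SD of 0.98 is expressed in raw points; neither assumption is stated explicitly in the text, and all names are illustrative.

```python
# Sketch of the translation-task cutoff arithmetic.
# Assumptions (not stated in the text): maximum score = 16 items x 2 points,
# and the reported SD (0.98) is in raw points.
N_ITEMS = 16
POINTS_PER_ITEM = 2  # one for the thematic role pair, one for verb voice
MAX_SCORE = N_ITEMS * POINTS_PER_ITEM

MEAN_SCORE = 31.2
SD_SCORE = 0.98


def percent_of_max(score, max_score=MAX_SCORE):
    """Express a raw score as a percentage of the maximum score."""
    return 100.0 * score / max_score


def cutoff_two_sd_below(mean, sd):
    """Return the raw score lying two standard deviations below the mean."""
    return mean - 2.0 * sd


cutoff_points = cutoff_two_sd_below(MEAN_SCORE, SD_SCORE)
print(f"mean:   {percent_of_max(MEAN_SCORE):.1f}% of maximum")
print(f"cutoff: {cutoff_points:.2f} points = {percent_of_max(cutoff_points):.1f}% of maximum")
```

Under these assumptions the mean and the cutoff come out at roughly 97% and 91% of the maximum, in line with the reported values.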
Materials
Twelve experimental sentence pairs were constructed on the basis of the two patterns
illustrated in (13) and (14). The active sentences involved a preverbal sequence of a nominative-
marked noun and an accusative-marked noun, followed by a transitive verb with active voice,
and the passive sentences had a sequence of a nominative-marked noun and a dative-marked
noun, followed by a transitive verb with the passive morphology -rare. A manner adverbial phrase
(e.g., ‘badly,’ ‘immediately,’ and ‘seriously’) was added between the second noun and the
sentence-final verb. The hearsay morpheme (i.e., the reported evidential) was added in order to
manipulate the experimental stimuli so that the sentences would not end with a verb root. The
vocabulary items were drawn from the textbooks used in the first- to third-semester Japanese
courses at the institution where the study was conducted in order to ensure lexical familiarity.
Two female native speakers recorded the experimental sentences at a natural speaking rate.
The sentences were produced using standard intonation. The recording was done with a sampling
rate of 44,100 hertz. To analyze eye movements with reference to the corresponding spoken
sentences, we placed markers on the recorded stimuli. We marked the onset of the first noun and
the offset of the second noun (in the active condition, onnanohito-ga otokonohito-o ‘woman-
DAT’) and the onset and offset of the adverbial phrase (hidoku ‘badly’) and the sentence-final
verb phrase (in the active condition, tataitasoudesu ‘hit-PAST-HSY-COP’; in the passive
A 200-millisecond (ms) slice of silence was added between the offset of the second noun and the onset
of the adverbial phrase, and a 500-ms slice was added between the offset of the adverbial phrase
and the onset of the verb phrase to control the duration of the critical region.
participants’ looks to the active picture scenes during the audio presentation of the adverbial
phrase. Hereafter, this period will be called the critical region. The duration of the critical region
was 1200 ms, which approximately corresponds to the sum of the preadverbial slice of silence (200 ms), the
mean duration of the adverbial phrase period (500 ms), and the postadverbial silence (500 ms).
In order to take into account the time spent on planning and regulating eye movements, the
period of interest was offset by 200 ms (Matin, Shao & Boff, 1993). The statistical analysis was
performed on the critical region, which began 200 ms after the offset of the second noun and
lasted 1200 ms. Table 2 reports the mean durations of these sentence regions.
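The timing of the critical region can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the function names, the sample format (time in ms paired with the picture being fixated), and the example noun offset are all assumptions introduced here.

```python
# Sketch: locating the critical region relative to the second-noun offset.
# Durations are taken from the text; everything else is hypothetical.
SILENCE_BEFORE_ADVERB_MS = 200
MEAN_ADVERB_DURATION_MS = 500
SILENCE_AFTER_ADVERB_MS = 500
OCULOMOTOR_OFFSET_MS = 200  # time allowed for planning and launching a saccade


def critical_window(noun2_offset_ms):
    """Return (start, end) of the critical region, in ms from sentence onset."""
    duration = (SILENCE_BEFORE_ADVERB_MS
                + MEAN_ADVERB_DURATION_MS
                + SILENCE_AFTER_ADVERB_MS)
    start = noun2_offset_ms + OCULOMOTOR_OFFSET_MS
    return start, start + duration


def proportion_of_looks(samples, window, target):
    """Share of samples inside the window that fall on the target picture.

    samples: iterable of (time_ms, region) pairs, e.g. (2500, "active").
    Returns None when no sample falls inside the window.
    """
    start, end = window
    in_window = [region for t, region in samples if start <= t < end]
    if not in_window:
        return None
    return sum(region == target for region in in_window) / len(in_window)


# Hypothetical trial in which the second noun ends 1800 ms into the sentence.
window = critical_window(1800)
samples = [(1900, "active"), (2000, "passive"), (2500, "active"), (3199, "passive")]
print(window, proportion_of_looks(samples, window, "passive"))
```

Shifting the window by 200 ms means that looks counted in the critical region reflect input heard up to the offset of the second noun, before any verb morphology is available.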
For each verb, two line drawings were created (see Figure 1). These images were drawn
by hand, then scanned and edited for stimulus presentation. One drawing depicted the active
reading (i.e., ‘the woman hit the man’), and the other drawing reversed the agent and patient
roles (i.e., ‘the woman was hit by the man’). For each condition, 12 visual scenes were created.
In order to ensure the reliability of these drawings, two native speakers of Japanese and one
native speaker of English with advanced Japanese proficiency, who did not participate in the main
experiment, were asked to match the experimental sentences with the corresponding drawings.
There was no disagreement or confusion in the matching procedure. These images were
The prerecorded spoken sentences were randomly assigned to the two experimental lists,
such that the active and passive readings of the sentences each appeared on one of the lists, and
each list contained an equal number of active and passive sentences. Each experimental list
comprised 12 items, with 6 items per experimental condition, and these experimental sentences
were combined with 36 filler sentences. The filler sentences had lengths and complexity similar
to those of the experimental sentences.5 The picture scenes and recorded spoken sentences were
presented in a random order determined by the stimulus presentation system associated with the
eye-tracking system. The positions of the active and passive picture scenes were counterbalanced.
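One way to build two such lists is sketched below. The text does not describe the actual randomization procedure, so the shuffle-and-split approach, the function name, and the fixed seed are assumptions made for illustration; fillers and the left/right counterbalancing of the pictures are left out.

```python
import random


def build_lists(n_items=12, seed=0):
    """Split item pairs across two lists with 6 active and 6 passive versions each.

    Each item appears on both lists, once in each voice.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    items = list(range(1, n_items + 1))
    rng.shuffle(items)
    active_on_a = set(items[: n_items // 2])  # half the items are active on list A

    list_a, list_b = [], []
    for item in sorted(items):
        if item in active_on_a:
            list_a.append((item, "active"))
            list_b.append((item, "passive"))
        else:
            list_a.append((item, "passive"))
            list_b.append((item, "active"))
    return list_a, list_b


list_a, list_b = build_lists()
print(list_a)
print(list_b)
```

With this design no participant hears both the active and the passive version of the same item, while each item contributes data to both conditions across participants.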
The participants’ eye movements were recorded with an EyeLink 1000 desktop-mounted
tracking device manufactured by SR Research, with a monocular sampling rate of 1000 hertz.
The tracking device was connected to a PC that controlled the stimulus display and data storage.
The visual stimuli were presented on a color 22-inch ViewSonic monitor. The participants were
seated approximately 65 centimeters from the monitor and rested their chins comfortably on a
chin rest. The participants’ eye positions were calibrated at the beginning of the experimental
session and whenever it was necessary thereafter, using a nine-point calibration procedure for
Procedure
The participants performed the eye-tracking task individually. They were instructed to determine
which picture scene would match the sentence that they heard and to click the picture with the
mouse. No feedback was given on their responses. At the beginning of each trial, the participants
viewed a fixation cross at the center of the screen. After 1500 ms, the fixation cross
automatically disappeared, and then a trial started. A visual scene with two picture objects
appeared on the screen, and a lead sentence (i.e., doshitandesuka? ‘What happened?’) was
played. The onset of the visual scene display was matched to the onset of the lead sentence. After
hearing the lead sentence, the participants heard a target sentence, in either active or passive
voice. This orally presented sentence matched one of the picture objects in the visual scene. The
visual scene disappeared 2000 ms after the offset of the sentence-final verb. Five practice
sentences preceded the experimental items. The eye-tracking session took approximately 25
minutes to complete.
Analysis
Data treatments
We calculated response accuracy on the eye-tracking task (i.e., mouse clicks on the correct
picture scenes). We excluded data from any participants whose responses were incorrect on more
than one-third of the items in the eye-tracking study in order to ensure the reliability of their eye
movement patterns. The data from all of the native Japanese participants were carried forward
for analysis (response accuracy for actives, M = 1.00, SD = 0; for passives, M = 0.96, SD = 0.07,
respectively). The data from 9 L2-learner participants were removed because of their low
eighth-semester Japanese courses. The remaining data, from 20 native speakers of Japanese (8
males and 12 females) and 20 L2 learners of Japanese (10 males and 10 females), were
submitted to analysis (response accuracy for actives, M = 0.91, SD = 0.10; for passives, M = 0.76,
SD = 0.21, respectively).
The accuracy data from the native-speaker and the L2-learner groups were analyzed
using logit mixed models (Jaeger, 2008). We first compared the native-speaker and the L2-
learner groups in a single model. The model included the fixed effects condition and group.6 The
random-effect structure included random intercepts for participant and for item, together with a
by-participant random slope for condition. We used treatment coding for both fixed effects: for
condition, with the active condition as the baseline, and for group, with the native-speaker group
as the baseline. There was a significant effect of condition (estimate = –1.43, Wald Z = –2.52, p
= .011), indicating that the native-speaker group performed less accurately on passive sentences
than on active sentences. There was also a significant effect of group (estimate = –2.50, Wald Z
= –4.43, p < .001), suggesting that the L2 learners were less accurate than the native speakers in
the active condition. In addition, we separately analyzed the L2-learner group using a model that
included fixed effects of condition and proficiency and the interaction of these two effects. The
random-effect structure of the model was the same as that of the joint model described earlier.
The L2 learners performed less accurately on passive sentences than active sentences (estimate =
–1.29, Wald Z = –2.10, p = .026). There was no effect of proficiency. Table 3 shows the results
For each participant, the eye-tracking data were analyzed only for the sentences for which
they had clicked the picture object correctly, following the most often employed procedure for
selecting the measure of performance in behavioral tasks. This process led to the exclusion of 2%
of the native-speaker data and 16% of the L2-learner data. The average track loss was 14% for
the native-speaker group and 13% for the L2-learner group. There was no trial with more than 40%
track loss, and therefore, all data points were submitted to analysis. For the analysis of the verb
disambiguation region, we included a total of 107 data points from the native speakers (59 target-
initial trials and 47 distractor-initial trials) and 88 data points from the L2 learners (38 target-
To assess the time course of processing, we used growth curve analysis, adapted for
visual-world data (Mirman, Dixon & Magnuson, 2008). Growth curve analysis is a multilevel
regression technique that allows for the estimation of inter- and intra-individual patterns that
change over time (Raudenbush & Bryk, 2002). Although there is no consensus on the manner in
which time course data from the visual-world method should be analyzed (see the special issue
of the Journal of Memory and Language, Volume 59, 2008, for topics on eye-tracking data
analysis), our data characteristics are in line with the assumptions of growth curve modeling.
With growth curve modeling, it is possible to analyze processing trajectories that change
nonlinearly over time, observed in visual-world paradigm data when the proportion of fixations
at the beginning is low and the curve levels off at the tails. The time effect was represented using
characteristics of the data (Mirman et al., 2008). For example, the constant term captures the
overall effect of condition for the entire analysis window; this is important because the critical
region in the present experiment was in the middle of the sentence, and there may have been
some preexisting effects at the onset of the analysis window. Similarly, the linear term captures
the overall rate of increase or decrease of the effect over time. Including a higher-order term does
not change the value or the interpretation of the estimated lower-order terms (Bollen, 2007),
The analysis was conducted on fixations to the active picture scenes in 100-ms bins for
the critical-region duration of 1200 ms. First, we analyzed all participants in a single model,
directly comparing the effect between the native speakers and the L2 learners. We used treatment
coding for both fixed effects: for condition, with the active condition as the baseline, and for
group, with the native-speaker group as the baseline. This choice of coding is suitable because
we are interested in examining the effect of condition, controlling for group, and the effect of
group, controlling for condition. Having the active condition and the native-speaker group as
baselines helps us determine how native speakers of Japanese process active and passive
sentences and whether L2 learners’ processing pattern is different from that of native speakers
when comprehending active and passive sentences. Our model included fixed effects of group
and condition and their interaction, each crossed with a second-order orthogonal polynomial (the
linear and quadratic terms). The model also included a random intercept for participant,
together with by-participant random slopes for condition,
crossed with the linear and quadratic terms. By plotting the fixation trajectories, we judged that
the second-order orthogonal polynomial sufficiently characterizes the data, and therefore, the
cubic term was excluded (for a discussion of interpreting cubic terms in cognitive psychology
data, see Mirman et al., 2008). We analyzed the L2-learner data separately, using a model with
the fixed effects of condition and proficiency and their interaction, crossed with the linear and
quadratic terms. The random-effect structure was the same as in the joint model. In modeling the
random-effect structure, we followed the “keep it maximal” approach (Barr, Levy, Scheepers & Tily,
2013). All growth curve analyses in the current study were performed using the lme4 package in
the statistical software environment R (Bates, Maechler, Bolker & Walker, 2014).
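To make the time terms concrete, the following Python sketch builds linear and quadratic terms that are orthogonal to each other and to the constant term for the 12 bins implied by a 1200-ms window in 100-ms steps. It is a minimal illustration using Gram-Schmidt orthogonalization; the function names are hypothetical, and the scaling (unit length) is an assumption that may differ from the scaling used in the actual analysis.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def normalise(v):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def orthogonal_time_terms(n_bins):
    """Return (linear, quadratic) time terms that are orthogonal to each
    other and to the constant term, each scaled to unit length."""
    if n_bins < 3:
        raise ValueError("need at least three time bins for a quadratic term")
    t = list(range(n_bins))
    mean_t = sum(t) / n_bins
    # Linear term: centred time, hence orthogonal to the constant term.
    lin = [x - mean_t for x in t]
    # Quadratic term: squared centred time with its projections onto the
    # constant and linear terms removed (Gram-Schmidt).
    sq = [x * x for x in lin]
    mean_sq = sum(sq) / n_bins
    quad = [x - mean_sq for x in sq]
    proj = dot(quad, lin) / dot(lin, lin)
    quad = [q - proj * l for q, l in zip(quad, lin)]
    return normalise(lin), normalise(quad)


# A 1200-ms critical region in 100-ms bins gives 12 time points.
linear, quadratic = orthogonal_time_terms(12)
print([round(x, 3) for x in linear])
print([round(x, 3) for x in quadratic])
```

Because the terms are mutually orthogonal, adding the quadratic term leaves the linear term unchanged, which is the property that makes the lower-order estimates interpretable on their own.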
To assess whether the native Japanese speakers and the L2 learners were sensitive to
track the participants’ eye movements separately for target-initial trials—that is, those in which
the participants started by looking at the passive picture scene—and distractor-initial trials—that
is, those in which the participants began by looking at the active picture scene (Fernald, Pinto,
Swingley, Weinberg & McRoberts, 1998; Fernald, Zangl, Portillo & Marchman, 2008). These
trials were divided based on the fixation location at the onset of the verb. In successful
comprehension, the participants who start with looks at the target picture scene should keep
looking at the target, whereas the participants who initially look at the distractor should shift
We used treatment coding for the location of initial gaze (i.e., initial gaze), with the
target-initial trials as the baseline, and also used treatment coding for group, with the
native-speaker group as the baseline. We used growth curve analysis of the verb disambiguation region
with the fixed effects of group and trial type and the group-by-trial-type interaction, crossed with a
second-order orthogonal polynomial. The random-effect structure included a random intercept for
participant and by-participant random slopes for trial type, crossed with the linear and quadratic
terms.
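The onset-contingent split can be illustrated with a short Python sketch. It assumes that each trial is represented as a list of gaze labels, one per 100-ms bin starting at verb onset, with "target", "distractor", or None for track loss; this representation, the function names, and the definition of a shift (gaze on the picture not fixated at verb onset) are assumptions made for the example, not a description of the study's processing pipeline.

```python
from collections import defaultdict


def classify_trial(samples):
    """Label a trial by gaze location at verb onset (the first sample)."""
    first = samples[0]
    if first == "target":
        return "target-initial"
    if first == "distractor":
        return "distractor-initial"
    return None  # gaze elsewhere or track loss at verb onset


def shift_proportions(trials):
    """For each trial type, map each bin index to the proportion of trials
    in which gaze is on the picture NOT fixated at verb onset."""
    shifted = defaultdict(lambda: defaultdict(int))
    valid = defaultdict(lambda: defaultdict(int))
    for samples in trials:
        kind = classify_trial(samples)
        if kind is None:
            continue  # trial cannot be assigned to either type
        start = samples[0]
        for b, look in enumerate(samples):
            if look not in ("target", "distractor"):
                continue  # skip bins without usable gaze data
            valid[kind][b] += 1
            if look != start:
                shifted[kind][b] += 1
    return {
        kind: {b: shifted[kind][b] / n for b, n in sorted(bins.items())}
        for kind, bins in valid.items()
    }


# Made-up gaze records for four trials (three bins each).
trials = [
    ["target", "target", "target"],
    ["distractor", "distractor", "target"],
    ["distractor", "target", "target"],
    [None, "target", "target"],
]
print(shift_proportions(trials))
```

In this toy data set the target-initial trial never shifts, whereas the proportion of shifts in the distractor-initial trials rises from 0 to 0.5 to 1.0 across bins, which is the pattern expected under successful comprehension.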
Results
Figure 3 (a) illustrates the native speakers’ looks to the active picture scenes aggregated by
condition in each 100-ms bin. There was a significant effect of condition (estimate = –0.26, t(40)
= –4.63, p < .001), which captures the difference in the looks to the active picture scenes
between the active and the passive conditions in the native-speaker group; the negative estimate
indicates that the looks to the active picture scenes in the passive condition were made less
frequently than those in the active condition. There were significant effects on the linear term
(estimate = 0.21, t(40) = 2.30, p = .026) and the quadratic term (estimate = –0.13, t(40) = –2.69,
p = .010). These significant effects suggest that, in the active condition, the native speakers’
looks to the active picture progressively increased. Similarly, the effect of condition on the
linear term was significant (estimate = –0.38, t(40) = –2.78, p = .008); in particular, the negative
estimate for the linear term indicates that the rate at which the native speakers looked at the
active picture scenes was lower in the passive condition than in the active condition. The effect
of condition on the quadratic term was also significant (estimate = 0.26, t(40) = 4.32, p < .001),
The model also demonstrated a significant effect of group (estimate = –0.17, t(40) = –
2.88, p = .006), which suggests that, in the active condition, the L2 learners looked at the active
picture scenes less than the native speakers did. The interaction of group and condition was also
significant (estimate = 0.19, t(40) = 2.35, p = .023), indicating that the differences between the
looks to the active picture scenes in the active condition and those in the passive condition vary
substantially between the two groups. However, in the active condition, there was no effect of
group on the two time terms. In the passive condition, there was no effect of group on the linear
term, but the group effect on the quadratic term was significant (estimate = –0.35, t(40) = –4.08,
p < .001), capturing the curve difference between the native speakers and the L2 learners in the
passive condition. Figure 3 (b) illustrates the L2 learners’ looks to the active picture scenes
A separate analysis was conducted on the L2 learners’ fixations to the active picture
scenes. We did not find a significant difference in the looks to the active picture scenes between
the active and the passive conditions; there was no effect of condition on the intercept, the linear,
or the quadratic term. Neither the effect of proficiency nor the interaction effect of proficiency
and condition was significant on the intercept or on the linear term. However, proficiency
showed a significant effect on the quadratic term (estimate = –0.01, t(20) = –2.40, p = .026). The
negative coefficient of proficiency on the quadratic term indicates that, in the active condition,
the more proficient the learners were, the higher the rise-and-fall rate of looks to the active
scenes they exhibited (i.e., the curvature is downwards). Furthermore, the interaction effect of
proficiency and condition on the quadratic term was also significant (estimate = 0.02, t(20) =
3.21, p = .004). The positive estimate suggests that the effect of proficiency on the rise-and-fall
rate in the passive condition was smaller than in the active condition (i.e., the curvature is
Figure 4 shows the onset-contingent plot for the verb disambiguation region. We performed a
growth curve analysis on the native speakers’ and the L2 learners’ proportions of looks switched
in the passive condition, and our model compared the differences in the time course of
processing in the target-initial and distractor-initial trials of those two groups. The effect of
initial gaze was significant on the intercept term (estimate = 0.39, t(27.31) = 5.96, p < .001),
indicating that the native speakers made more shifts to the correct picture scenes in the
distractor-initial trials than in the target-initial trials. The native speakers’ looks to the correct
picture scenes stayed flat in the target-initial trials; however, the looks in the distractor-initial
trials increased progressively over time, which is captured by the significant effect of initial gaze
on the linear term (estimate = 1.02, t(23.55) = 6.31, p < .001). The effect of initial gaze on the
quadratic term was not significant. The native speakers’ processing trajectory shows that as soon
as they received the information from the passive morpheme, they discriminated the two picture
The model also showed that, on the intercept term, neither the effect of group nor the
interaction effect of group and initial gaze was significant. These results suggest that the overall
proportions of shift were not reliably different between the two groups. The rate at which the L2
learners erroneously shifted away from the target picture scenes was slightly higher than that of
the native speakers, but this difference was not statistically significant; this is indexed by the
effect of group being nonsignificant on both the linear term and the
quadratic term. Similarly, there was no significant interaction effect of group and initial gaze on
the intercept term, but the same interaction effect of group and initial gaze on the linear term was
significant (estimate = –0.66, t(26.04) = –2.78, p = .009). This indicates that the rate at which
the L2 learners correctly shifted to the target picture scenes in the distractor-initial trials was
lower than that of the native speakers. This interaction effect on the quadratic term was not
significant. From a visual inspection of Figure 4, the L2 learners seemed to exhibit more false
alarms in the target-initial trials than the native speakers did, but this difference did not reach
statistical significance. The current analysis, however, demonstrated that the L2 learners
integrated the passive morpheme information and looked at the target picture scenes in a manner
qualitatively similar to that of the native speakers. A summary of the results is shown in Table 6.
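Because both predictors were treatment coded with the native-speaker group as the baseline, the learners' simple effects can be recovered by adding the relevant interaction coefficient to the baseline-group coefficient. The sketch below applies this arithmetic to point estimates reported in the text; the function name is hypothetical, and the sums are point estimates only, with no standard errors or significance tests attached.

```python
def effect_in_comparison_group(baseline_effect, interaction):
    """Under treatment coding, the effect of a predictor in the non-baseline
    group equals the baseline-group effect plus the interaction coefficient."""
    return baseline_effect + interaction


# Verb disambiguation region, linear term (estimates copied from the text):
# effect of initial gaze for native speakers, and its interaction with group.
native_initial_gaze_linear = 1.02
group_by_initial_gaze_linear = -0.66
l2_initial_gaze_linear = effect_in_comparison_group(
    native_initial_gaze_linear, group_by_initial_gaze_linear
)
print(f"L2 initial-gaze effect on linear term: {l2_initial_gaze_linear:.2f}")

# Critical region, intercept term (estimates copied from the text):
# effect of condition for native speakers, and its interaction with group.
native_condition_intercept = -0.26
group_by_condition_intercept = 0.19
l2_condition_intercept = effect_in_comparison_group(
    native_condition_intercept, group_by_condition_intercept
)
print(f"L2 condition effect on intercept: {l2_condition_intercept:.2f}")
```

The resulting values (0.36 and -0.07) are smaller in magnitude than the corresponding native-speaker estimates (1.02 and -0.26), in line with the reported group-by-initial-gaze and group-by-condition interactions.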
In the present study, we investigated whether Japanese comprehenders use knowledge of the case
system for incrementally assigning thematic roles and activating the representations of upcoming
verb voice and examined how they disambiguate their initial interpretation when arriving at
sentence-final verbs.
upcoming verb voice. In line with the previous results of Kamide and colleagues (2003), the
present study demonstrated that the native speakers used the information from case markers to
assign thematic roles prior to hearing the verb. Once a sequence of two nouns was perceived, the
native speakers’ fixations to the active picture scenes increased in the active condition and
decreased in the passive condition progressively over time and quickly leveled off to the
asymptote. These processing patterns suggest that the native speakers committed to real-time
structure building in the preverbal position and predicted verb voice. However, the L2 learners
did not seem to use case-marked nouns to assign thematic roles in the preverbal position in a
manner comparable to that of the native speakers. The learners’ processing patterns during the
critical region show that, compared to the native speakers, the L2 learners were less likely to
look at the active picture scenes in the active condition, and were more likely to look at the
active picture scenes in the passive condition. This pattern of results suggests a lack of preverbal
We also examined how the information from passive verb morphology feeds into real-
time comprehension. The analysis of the verb disambiguation region suggests that both the
native-speaker and the L2-learner groups clearly distinguished passive sentences from active
sentences; the target-initial trials were associated with a lower rate of shifts than were the
distractor-initial trials. This pattern, in turn, indicates that the L2 learners were able to use the
passive morpheme to drive their visual inspection patterns in a manner qualitatively equivalent to
that of the native speakers. Our results provide clear behavioral evidence that certain
grammatical cues present challenges for L2 learners to deploy during real-time processing,
whereas other cues can be used in a manner highly comparable to that of native speakers.
predictive processing. The issues in regard to limited resources, pointed out by the RAGE
hypothesis (Grüter et al., 2014, 2016), and to stored differences of frequency information (Kaan,
2014) guide us to further consider what would make a certain linguistic cue drain processing
resources and introduce L1–L2 differences in incremental processing. There are certainly many
possibilities, in light of the fact that Japanese passives present a number of structural differences
compared with passives in English, the learners’ L1. Such cross-linguistic differences may
contribute to the difficulty for the L2 learners; comprehenders naturally have biases that reflect
First, the L2 learners’ lack of predictive processing can be attributed to the difficulty of
the Japanese case system for L2 learners, which is congruent with the findings of the competition
model studies (for a review, see Sasaki & MacWhinney, 2006). Because of the inherent
difficulty, it is possible that L2 learners have partial knowledge of case markers in general.
Perhaps, then, such partial knowledge of case markers is an artifact of the difficulty associated
with linking form and meaning. DeKeyser (2005) argued that when distinct forms express the
same meaning and the same form expresses distinct meanings, it makes establishing the link
between form and meaning difficult for L2 learners (i.e., an opacity of form and meaning) and
thus makes retrieval difficult. The Japanese case system seems a relevant instance of such
opacity. For instance, the nominative case marker –ga must appear in each clause; therefore, –ga
is considered the default case (Fukui, 1986). However, this default status can also induce
mappings of one form to multiple interpretations. In regard to the present investigation, the case
alternation in the active and passive voices presents opacity: In passive sentences, the
nominative-marked noun is a logical object (i.e., a theme or patient) that, instead, functions as
the grammatical subject. Furthermore, some grammatically simple and frequent structures also
jeopardize the link between case markers and thematic roles. Nominative object constructions—
for example, a transitive adjective, such as suki ‘fond of’ and a complex predicate consisting of
an action verb and an auxiliary adjective tai ‘want’—are cases in point. All of these instances of
logical objects marked in the nominative case make the link between case markers and thematic
roles hard to establish; the weak form–meaning mapping gives rise to low cue reliability
By extension, the account of opacity in form–meaning mapping and its retrieval difficulty
may help us reconcile why the L2 learners in the present study did not use the information from
the case markers, whereas some previous studies on German have demonstrated L2 learners’
nativelike use of case information (Hopp, 2009; Jackson & Dussias, 2009). For example, when
processing German subject- and object-extracted wh-questions (e.g., ‘Who do you think admired
the athlete after the game?’ or ‘Who do you think the athlete admired after the game?’), the
initial wh-words were immediately and unambiguously coded as the subject or direct object (i.e.,
wer ‘who’ or wen ‘whom’) of the complement clause (Jackson & Dussias, 2009). In contrast,
when processing temporarily ambiguous sentences (‘Which engineer met the chemist?’ or
‘Which engineer did the chemist meet?’), in which the wh-word presents the opacity (i.e.,
Welche-NOM or Welche-ACC), the effect of case markers was not observed in L2 (Hopp, 2006;
Jackson, 2008). In the present study, the participants were first required to deal with the opacity
of each case marker and then were required to attend to the joint cues of information posed by
two nouns to anticipate verb voice. The nature of prediction required here is susceptible to
processing difficulty because of the multiple levels of ambiguity. We believe that, in future
studies, measuring the opacity of form–meaning relationships and the associated difficulty of
retrieval may provide a more general account of L2 predictive processing, because it potentially
provides an idea of how the reduced ability in anticipation interacts with the design features of a
language.
combination of case-marked preverbal noun phrases and passive verbs. Even though the analysis
of the verb disambiguation region showed that the L2 learners successfully integrated the passive
morpheme, holding two preverbal nouns without assigning thematic roles presumably imposes
a heavy load on working memory by the time the disambiguating verb is encountered
(Harrington & Sawyer, 1992; Just & Carpenter, 1992). The low response accuracy scores of L2
information and integrating it with predicted verb forms. Integration difficulty of Japanese
passive voice can be considered at least from two perspectives. One is to assume that success in
preverbal processing is a prerequisite for success in processing the sentence-final verb. Mazuka
and Itoh (1995) argued that native Japanese speakers assign thematic roles tentatively during
preverbal processing. After the verb arrives, the assignment of thematic roles is reevaluated with
the added information from the verb. The findings from the present study can be interpreted as
suggesting that the converse of Mazuka and Itoh’s (1995) argument may also hold. In particular,
successful comprehension requires that the components of sentences be integrated in real time
and the earlier structural hypothesis be evaluated as compatible or not later on. The need for this
later compatibility check presents an obvious difficulty when the comprehender does not have a
set of initially constructed representations that he or she checks the compatibility against.
Another perspective is that the integration difficulty resides in the structure of the passive
construction itself. Passives require thematic roles to be assigned in an atypical order (i.e.,
protopatient before protoagent), which is known to pose a processing challenge even for adult
native speakers (Ferreira, 2003). Furthermore, a series of recent studies demonstrated that the
order of linguistic cues also contributes to acquisition and processing difficulty (Choi &
Trueswell, 2010; Pozzan & Trueswell, 2015; Trueswell, Kaufman, Hafri & Lidz, 2012). Children
comprehending causative verbs than children learning Tagalog, a verb-initial language that also
has causative verb morphology (Trueswell et al., 2012). Similarly, Pozzan and Trueswell (2015)
maintained that linguistic cues that become available late in the sentence are more difficult for
adults learning a miniature artificial language to acquire than early-arriving cues. The current
study, however, has shown that case markers, a cue early in the stimulus, were not efficiently
used by the L2 learners, but passive morphology, a later cue, was. This suggests that the order of
linguistic cues alone may not be a predictor of processing difficulty for L2 learners, at least in the
Our results also give rise to a number of issues that warrant further investigation. One of
these involves the effect of L2 proficiency. From the visual inspection of the critical region
(Figure 3), one observes that the learners’ fixation patterns started to diverge at around 500 ms
after the onset of the critical region, and these trends continued to the verb disambiguation region.
These fixation patterns can be taken as a sign of emerging anticipatory processing by L2 learners;
it is delayed but qualitatively comparable to that of native speakers. Additionally, the significant
effects of proficiency on the quadratic term (i.e., the effect of proficiency on the quadratic term
and effect of the interaction of proficiency and condition on the quadratic term) provide partial
However, the picture is not complete; proficiency alone did not surface as a significant
predictor, but it influenced the rise-and-fall rate of the fixation trajectory. If proficiency is a
critical factor in predicting whether L2 learners can exhibit nativelike processing in Japanese, we
should have expected the effect of proficiency and the interaction of proficiency and condition
on the intercept and the linear terms too. One speculative explanation of what may be causing
this incompleteness is the small divergence in L2 proficiency; our L2 participants were recruited
from college-level Japanese courses, and thus differences in learning experience are as small as a
predictive processing and including participants with varying proficiency may well help us
Secondly, in this study, we demonstrated that L2 learners were able to integrate passive
whether such a comprehension difficulty evolved because of thematic reanalysis or its interplay
with syntactic reanalysis. A passive sentence involves thematic reanalysis and syntactic
reanalysis (for theoretical assumptions, see Hoshi, 1991, 1999; Miyagawa, 1999). Hirotani and
colleagues (2011) compared neurocognitive processes when native Japanese speakers process
active, passive, and causative sentences. The participants’ neural network showed sensitivity to
thematic reanalysis different from that to syntactic reanalysis. In future research, the inclusion of
sentences that involve the same sequence of preverbal nouns with varying verb types, such as
transfer datives (i.e., no reanalysis is required), causatives (i.e., thematic reanalysis is required),
and passives (i.e., thematic and syntactic reanalysis is required), is much needed to tease apart
By controlling the sentence condition more systematically, we may be able to reason more
clearly about what, precisely, is happening in the preverbal processing. With the choice between
only active and passive voice, although filler sentences were included, the native Japanese
speakers may have used case-marker cues strategically by locking in on the idea that
of preverbal nouns consistent allows us to distinguish this type of keyword-driven response from
diversion between the active and passive conditions roughly around the offset of the first noun
and slightly before the offset of the second noun.8 This fixation pattern is not fully explicable by
the default thematic assignment and combinatory frequency information of two case-marked
nouns; one would hypothesize that the proportion of the fixation to the active picture scenes
would have increased in both the active and the passive conditions. This study focused on the
adverbial region, because we anticipated that the processor needed some buffer to generate
Japanese has a diverse set of structural choices given by case markers at different points in
sentences (Chang, 2009), and therefore, in future studies, linguistic conditions leading to the
systematically interpreted.
incrementally assign thematic roles before encountering a sentence-final verb and generate
predictions of that verb’s voice. The results demonstrated that, although the native Japanese
with a predicted verb voice aided by case markers, the L2 learners were less committed in their
assignment of thematic roles to the preverbal nouns. Nevertheless, the L2 learners were able to
integrate information from the passive morpheme, successfully discriminating active and passive
sentences. These results highlight the usefulness of case markers in predictive processing in
Japanese; L2 learners are less efficient in using this cue, but they use information indicated by
Footnotes
1
The inflectional abbreviations in this paper are ACC, accusative case; COP, copula;
DAT, dative case; FEM, feminine gender; HSY, hearsay (i.e., reported evidential); MASC,
masculine gender; NOM, nominative case; PAST, past tense; PASS, passive voice; PL, plural.
2
Japanese has another type of passive, known as the indirect passive (Shibatani, 1990;
Tsujimura, 1996) and also called the adversative passive (Kuno, 1973). Indirect passives can
involve intransitive verbs, and various sentential elements can be represented as the passivized
subject.
3
We thank two anonymous reviewers for suggesting this interpretation.
4
Including the translation task score as a fixed effect did not change the pattern of the
results.
5
Filler sentences were included in order to divert the participants’ attention from the
objective of the experiment. There were cleft sentences, subject and object relatives, and
sentences involving affirmative and negative polarity adverbials. These fillers were chosen
because they require participants to keep track of various parts of the sentences, such as verb
determinism, the models exhibited strong collinearity between the effect of condition and the
interaction of group and condition. Therefore, we removed the interaction term of group and
condition from the models. Because our statistical analysis lacks a critical comparison, we avoid
from −1700 to −700 ms, and the second noun period was from −600 to 200 ms, with the onset of
the critical region at 0 ms of 800 ms each, which approximately corresponds to the mean
durations of these noun phrases, offset by 200 ms (Matin et al., 1993). We used time-dependent
linear mixed effect modeling, including fixed effects for the condition and time, with the
intercept term centered in the analysis periods. There was a significant effect of condition on the intercept in the
first noun period (estimate = 0.35, t = 2.29, p = .021), suggesting that the native speakers looked
at the active picture scenes more in the passive condition in the first noun period, but the
condition effect on the intercept was not significant in the second noun period (estimate = 0.31, t
= 1.80, p = .070). There were no other significant effects. Our speculation here is that the native
speakers mapped the first noun onto the agent in both conditions (Bever, 1970), but they
immediately incorporated various cues provided by the second noun (e.g., prosody, visual scenes,
and plausibility). The fixation diversion observed in the second noun period might be a residual
effect of processing the first noun. We assume that the speed at which the cues are integrated
References
Altmann, G., & Kamide, Y. (1999). Incremental interpretation at verbs: Restricting the domain
Aoshima, S., Phillips, C., & Weinberg, A. (2004). Processing filler-gap dependencies in a head-
Bader, M., & Meng, M. (1999). Subject-object ambiguities in German embedded clauses: An
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for
confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68,
255–278.
Bates, D. M., Maechler, M., Bolker, B., & Walker, S. (2014). Linear mixed-effects models using
Bever, T. G. (1970). The cognitive basis for linguistic structure. In J. R. Hayes (Ed.), Cognition
Bollen, K. A. (2007). On the origins of latent curve models. In R. Cudeck & R. MacCallum (Eds.),
Factor analysis at 100 (pp. 79–98). Mahwah, NJ: Lawrence Erlbaum Associates.
Borer, H., & Wexler, K. (1987). The Maturation of Syntax. Studies in Theoretical
Psycholinguistics 4, 23–172.
Chang, F. (2009). Learning to order words: A connectionist model of heavy NP shift and
accessibility effects in Japanese and English. Journal of Memory and Language, 61, 374–
397.
Choi, Y., & Trueswell, J. C. (2010). Children’s (in)ability to recover from garden-paths in a
Clancy, P. M. (1985). The acquisition of Japanese. In D. I. Slobin (Ed.), The crosslinguistic study
Dahan, D., Swingley, D., Tanenhaus, M. K., & Magnuson, J. S. (2000). Linguistic gender and
DeLong, K. A., Urbach, T. P., & Kutas, M. (2005). Probabilistic word pre-activation during
1117–1121.
Dowens, M. G., Vergara, M., Barber, H. A., & Carreiras, M. (2010). Morphosyntactic processing
Dufour, R., & Kroll, J. F. (1995). Matching words to concepts in two languages: A test of the
concept mediation model of bilingual representation. Memory and Cognition, 23, 166–180.
Dussias, P. E., Marful, A., Gerfen, C., & Molina, M. T. B. (2010). Usage frequencies of
complement-taking verbs in Spanish and English: Data from Spanish monolinguals and
Dussias, P. E., Valdés Kroff, J. R., Guzzardo Tamargo, R. E., & Gerfen, C. (2013). When gender
and looking go hand in hand. Studies in Second Language Acquisition, 35, 353–387.
Eda, S., Itomitsu, M., & Noda, M. (2008). The Japanese skills test as an on-demand placement
test: Validity comparisons and reliability. Foreign Language Annals, 41, 218–236.
Fanselow, G., Kliegl, R., & Schlesewsky, M. (1999). Processing difficulty and principles of
grammar. In S. Kemper & R. Kliegl (Eds.), Constraints on language: Aging, grammar, and
Federmeier, K. D. (2007). Thinking ahead: The role and roots of prediction in language
Fernald, A., Pinto, J. P., Swingley, D., Weinberg, A., & McRoberts, G. (1998). Rapid gains in
speed of verbal processing by infants in the second year. Psychological Science, 9, 228–231.
Fernald, A., Zangl, R., Portillo, A. L., & Marchman, V. A. (2008). Looking while listening: Using eye
164–203.
Frenck-Mestre, C., & Pynte, J. (1997). Syntactic ambiguity resolution while reading in second
and native languages. The Quarterly Journal of Experimental Psychology, 50A, 119–148.
Gahl, S., Jurafsky, D., & Roland, D. (2004). Verb subcategorization frequencies: American
Garnsey, S., Pearlmutter, N., Myers, E., & Lotocky, M. (1997). The contributions of verb bias
Sciences, 2, 262–268.
Grüter, T., Lew-Williams, C., & Fernald, A. (2012). Grammatical gender in L2: A production or
Grüter, T., Rohde, H., & Schafer, A. (2014). The role of discourse-level expectations in non-
native speakers' referential choices. In W. Orman & M. J. Valleau (Eds.), Proceedings of the
Grüter, T., Rohde, H., & Schafer, A. J. (2016). Coreference and discourse coherence in L2: The
Guillelmon, D., & Grosjean, F. (2001). The gender marking effect in spoken word recognition:
Hakuta, K. (1982). Interaction between particles and word order in the comprehension and
Second Meeting of the North American Chapter of the Association for Computational
Harrington, M., & Sawyer, M. (1992). L2 working memory capacity and L2 reading skill. Studies
Havik, E., Roberts, L., Van Hout, R., Schreuder, R., & Haverkort, M. (2009). Processing subject-
object ambiguities in the L2: A self-paced reading study with German L2 learners of Dutch.
Hirotani, M., Makuuchi, M., Rüschemeyer, S., & Friederici, A. D. (2011). Who was the agent?
The neural correlates of reanalysis processes during sentence comprehension. Human Brain
Hopp, H. (2006). Syntactic features and reanalysis in near-native processing. Second Language
Hopp, H. (2009). The syntax–discourse interface in near-native L2 acquisition: Off-line and on-
Hopp, H. (2013). Grammatical gender in adult L2 acquisition: Relations between lexical and
Hoshi, H. (1991). The generalized projection principle and its implications for passive
Hoshi, H. (1999). Passives. In N. Tsujimura (Ed.), The Handbook of Japanese Linguistics (pp.
Huang, Y. T., Zheng, X., Meng, X., & Snedeker, J. (2013). Children’s assignment of
Itomitsu, M. (1996). Developing Japanese skills test: Theoretical framework for a standardized
proficiency test. Unpublished Master's thesis, Ohio State University, Columbus, OH.
Iwasaki, N. (2008). L2 acquisition of Japanese: Knowledge and use of case particles in SOV and
OSV sentences. In S. Karimi (Ed.), Word order and scrambling (pp. 273–300). Malden,
Jackson, C. (2008). Proficiency level and the interaction of lexical and morphosyntactic
Jackson, C. N., & Dussias, P. E. (2009). Cross-linguistic differences and their impact on L2
Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and
towards logit mixed models. Journal of Memory and Language, 59, 434–446.
Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and disambiguation.
Kaan, E. (2014). Predictive sentence processing in L2 and L1: What is different? Linguistic
Kamide, Y., Altmann, G. T. M., & Haywood, S. L. (2003). The time-course of prediction in
Kilborn, K., & Ito, T. (1989). Sentence processing in Japanese-English and Dutch-English
Kimball, J. (1975). Predictive analysis and over-the-top parsing. In J. Kimball (Ed.), Syntax and
Kuno, S. (1973). The structure of the Japanese language. Cambridge, MA: MIT Press.
Lau, E., Stroud, C., Plesch, S., & Phillips, C. (2006). The role of structural prediction in rapid
Lee, E.-K., Lu, D. H.-Y., & Garnsey, S. M. (2013). L1 word order and sensitivity to verb bias in
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). The lexical nature of
Psycholinguistics, 8, 315–327.
MacWhinney, B. (2001). The competition model: The input, the context, and the brain. In P.
Robinson (Ed.), Cognition and Second Language Instruction (pp. 69–90). New York:
Marslen-Wilson, W. D. (1973). Linguistic structure and speech shadowing at very short latencies.
Martin, C. D., Thierry, G., Kuipers, J. R., Boutonnet, B., Foucart, A., & Costa, A. (2013). Bilinguals
reading in their second language do not predict upcoming words as native readers do.
Matin, E., Shao, K. C., & Boff, K. R. (1993). Saccadic overhead: Information-processing time
Mazuka, R., & Itoh, K. (1995). Can Japanese speakers be led down the garden path? In R.
Mazuka & N. Nagai (Eds.), Japanese sentence processing (pp. 295–329). Hillsdale, NJ:
Minai, U. (2000). The acquisition of Japanese passives. In M. Nakayama & C. J. Quinn (Eds.),
Mirman, D., Dixon, J. A., & Magnuson, J. S. (2008). Statistical and computational models of the
visual world paradigm: Growth curves and individual differences. Journal of Memory and
Mitsugi, S., & MacWhinney, B. (2010). Second language processing in Japanese scrambled
and parsing (Vol. 53, pp. 159–175). Philadelphia, PA: John Benjamins.
Mitsugi, S., & MacWhinney, B. (2016). The use of case marking for predictive processing in
Kuroshio Shuppan.
Nakano, Y., Felser, C., & Clahsen, H. (2002). Antecedent priming at trace positions in Japanese
[On the development of viewpoint expressions used by intermediate and advanced level
Pozzan, L., & Trueswell, J. C. (2015). Revise and resubmit: How real-time parsing limitations
Rounds, P. L., & Kanagy, R. (1998). Acquiring linguistic cues to identify agent. Studies in
Sano, T., Endo, M., & Yamakoshi, K. (2001). Developmental issues in the acquisition of
Japanese unaccusatives and passives. In A. H.-J. Do, L. Domínguez, & A. Johansen (Eds.),
Sasaki, Y. (1994). Paths of processing strategy transfers in Japanese and English as foreign
languages: A competition model approach. Studies in Second Language Acquisition, 16, 43–
72.
Sasaki, Y., & MacWhinney, B. (2006). The competition model. In M. Nakayama, R. Mazuka, &
Y. Shirai (Eds.), The handbook of East Asian psycholinguistics, Vol. 2: Japanese (pp. 307–
Sato, F. (1997). Eigo o bogo to suru nihongo gakushuusha no danwa bunseki [A discourse
analysis of learners of Japanese with the first language English background]. Proceedings
Shibatani, M. (1988). Voice in Philippine languages. In M. Shibatani (Ed.), Passive and voice
Staub, A., & Clifton, C., Jr. (2006). Syntactic prediction in language comprehension: Evidence
from either... or. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32,
425–436.
10, 145–156.
Tanaka, M. (1998). Factors affecting the acquisition of point of view, voice, and complex
sentence. In M. Tanaka (Ed.), The acquisition of point of view and voice in Japanese as a
Grants-in-Aid for Scientific Research Final Research Report Summary (Project Number:
08680323).
Tanenhaus, M. K. (2007). Eye movements and spoken language processing. In G. Gaskell (Ed.),
Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration
of visual and linguistic information in spoken language comprehension. Science, 268, 1632–
1634.
Trueswell, J. C., Kaufman, D., Hafri, A., & Lidz, J. (2012). Development of parsing abilities
interacts with grammar learning: Evidence from Tagalog and Kannada. In A. K. Biller, E. Y.
Chung, & A. E. Kimball (Eds.), Proceedings of the 36th annual Boston University
Shuppan.
Yamashita, H. (1997). The effects of word-order and case marking information on the processing
Two versions of each experimental sentence were created, corresponding to each of the
experimental conditions, as seen in sentence (1) below. For the remaining sentences, only
(1) a.
I heard that the child hastily woke up the father.
b.
I heard that the child was hastily woken up by the father.
(2)
I heard that the woman badly hit the man.
(3)
I heard that the elderly man thoroughly washed the boy.
(4)
I heard that the elderly woman slowly massaged the elderly man.
(5)
I heard that the boy suddenly hugged the girl.
(6)
I heard that the mother immediately called the child.
(7)
I heard that the woman quickly killed the man.
(8)
I heard that the woman extensively praised the man.
(9)
I heard that the man suddenly scolded the woman.
(10)
I heard that the girl badly bullied the boy.
(11)
I heard that the girl strongly kicked the boy.
(12)
I heard that the elderly man slowly pushed the elderly woman.
Measure M % SD M % SD
Years in Japan
None 9 45
1–2 years 7 35
Active   882   83   860  67  508  77  1170  140  4309  258
Passive  831  108   844  90  514  53  1305  135  4432  269
Mean     872   74   838  98  511  65  1237  151  4371  265
Note. A 200-millisecond (ms) slice of silence was added between the offset of the second noun and
the onset of the adverbial phrase, and a 500-ms slice was added between the offset of the adverbial
Table 3. Fixed effects in the logit mixed model of response accuracy in native speakers and in L2
learners.
Coefficient SE Wald Z p
All participants
L2 learners
Table 4. Overall growth curve analysis on fixations to the active picture scenes for the critical
region.
Estimate SE t p
Group (L2) × Condition (Passive) × Quadratic –0.35 0.08 –4.08 < .001
Table 5. Analysis of L2 learners’ fixation to the active pictures for the critical region.
Estimate SE t p
Table 6. Analysis of the rate of fixation changed to the target by the native speakers and the L2
Estimate SE t p
Group (L2) × Initial gaze (distractor-initial) × Linear –0.66 0.23 –2.78 .009
Group (L2) × Initial gaze (distractor-initial) × Quadratic –0.04 0.16 –0.27 .782
[Plot omitted. Recoverable details: critical region from 0 ms (second noun offset) to 1200 ms; y-axis: fixation probabilities (0.25 to 0.75); legend: Condition (Active, Passive); two panels.]
Figure 3. The Japanese native speakers’ and the L2 learners’ proportions of fixation to the active
picture, aggregated by condition in each 100-millisecond (ms) time bin for the critical region,
which includes the presentation of an adverbial phrase, following two preverbal nouns.
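The aggregation described in the caption (proportion of fixations to the active picture, by condition, in 100-ms bins) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the record layout `(condition, time_ms, on_active)`, the function name `fixation_proportions`, and the toy data are all hypothetical.

```python
from collections import defaultdict

BIN_MS = 100  # bin width stated in the Figure 3 caption


def fixation_proportions(samples, bin_ms=BIN_MS):
    """Return {(condition, bin_start_ms): proportion of samples on the active picture}.

    `samples` is an iterable of (condition, time_ms, on_active) tuples, where
    time_ms is measured from the onset of the critical region and on_active is
    True when the gaze sample falls on the active picture. This layout is an
    assumption made for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [samples on active, total samples]
    for condition, time_ms, on_active in samples:
        bin_start = (time_ms // bin_ms) * bin_ms  # left edge of the 100-ms bin
        cell = counts[(condition, bin_start)]
        cell[0] += int(on_active)
        cell[1] += 1
    return {key: on / total for key, (on, total) in counts.items()}


# Toy data, invented for illustration only (not the study's data).
toy = [
    ("active", 10, True), ("active", 60, True),
    ("active", 90, False), ("active", 95, True),
    ("passive", 20, False), ("passive", 70, True),
    ("active", 130, True),
]
props = fixation_proportions(toy)
```

Each key in `props` pairs a condition with the left edge of a time bin, so plotting the values against bin start, one line per condition, yields a curve of the kind shown in the figure.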
[Plot omitted. Recoverable details: y-axis: proportion of looks switched (0.00 to 1.00); legends: Initial gaze (Distractor-initial, Target-initial) and Group (Native speakers, L2 learners).]
Figure 4. Onset-contingent plot of distractor-initial and target-initial trials for the passive
condition by the native Japanese speakers and the L2 learners, measured from the
disambiguating verb’s onset. At each 100-millisecond (ms) interval, the data points show the
mean proportion of trials on which the participants shifted from the picture object they started on
to look at the other picture object. The left dashed line shows the average offset of the verb
phrase, and the dotted lines show the average exit time of an experimental trial for the native