
Language Learning

ISSN 0023-8333

Age of Onset and Nativelikeness in a Second Language: Listener Perception Versus Linguistic Scrutiny
Niclas Abrahamsson and Kenneth Hyltenstam
Stockholm University

The incidence of nativelikeness in adult second language acquisition is a controversial
issue in SLA research. Although some researchers claim that any learner, regardless
of age of acquisition, can attain nativelike levels of second language (L2) proficiency,
others hold that attainment of nativelike proficiency is, in principle, impossible. The
discussion has traditionally been framed within the paradigm of a critical period for
language acquisition and guided by the question of whether SLA is constrained by the
maturation of the brain. The work presented in this article can be positioned among
those studies that have focused exclusively on the apparent counterexamples to the
critical period. We report on a large-scale study of Spanish/Swedish bilinguals (n =
195) with differing ages of onset of acquisition (<1–47 years), all of whom identify
themselves as potentially nativelike in their L2. Listening sessions with native-speaker
judges showed that only a small minority of those bilinguals who had started their L2
acquisition after age 12, but a majority of those with an age of onset below this age,
were actually perceived as native speakers of Swedish. However, when a subset (n =
41) of those participants who did pass for native speakers was scrutinized in linguistic
detail with a battery of 10 highly complex, cognitively demanding tasks and detailed
measurements of linguistic performance, representation, and processing, none of the
late learners performed within the native-speaker range; in fact, the results also revealed
that only a few of the early learners exhibited actual nativelike competence and behavior
on all measures of L2 proficiency that were employed. Our primary interpretation of the
results is that nativelike ultimate attainment of a second language is, in principle, never
attained by adult learners and, furthermore, is much less common among child learners
than has previously been assumed.

This study was made possible by a research grant to K. H. and N. A. from The Bank of Sweden
Tercentenary Foundation (grant No. 1999-0383:01). We are greatly indebted to the participants of
the study, who without hesitation agreed to go through the 4-hour-long and quite demanding test
session. We would also like to thank Johan Roos for carrying out the testing and data collection,
Katrin Stolten for doing the VOT analyses, our colleagues at the Centre for Research on
Bilingualism at Stockholm University as well as the anonymous Language Learning reviewers for their
comments on earlier versions of this article, and Thomas Lavelle for correcting and improving our
English writing.
Correspondence concerning this article should be addressed to Niclas Abrahamsson, Centre
for Research on Bilingualism, Stockholm University, SE-106 91 Stockholm, Sweden. Internet:
niclas.abrahamsson@biling.su.se

Language Learning 59:2, June 2009, pp. 249–306
© 2009 Language Learning Research Club, University of Michigan

Keywords adult second language acquisition; age of onset; critical period hypothesis
(CPH); listener perception; maturational constraints; multiple-task design; nativelikeness; near-nativeness; L1 Spanish; L2 Swedish

Introduction
In Larry Selinker's seminal article "Interlanguage" (Selinker, 1972), published
during the first phase of theory development in second language acquisition
(SLA), a number of central concepts were discussed that together came to play a
crucial role in second language (L2) research. Although the term interlanguage
had already been introduced a few years earlier (Selinker, 1969), it was through
the 1972 article that it became established as a general term referring to the
separate linguistic system responsible for the learner's observable version of
the L2. In the same article, the term fossilization was introduced, as well as
the idea of a number of psycholinguistic, or cognitive, processes governing the
successive growth and change of the interlanguage.
In his article, which focused exclusively on adult L2 learners, Selinker
also dealt with the question of the relatively few individuals who despite a
late age of onset of acquisition succeed in reaching levels of L2 competence
comparable to that of native speakers; there, Selinker talked about "absolute
success" (1972, p. 212). In this context, he suggested that these individuals
may constitute approximately 5% of all adult L2 learners. The reason for
mentioning these learners only in passing was that he wanted to exclude them
from the domain of SLA research. These individuals, Selinker argued, are so
unique and make use of such different psychological processes in their learning
that they need not be considered at all in L2 theory building; these successful
learners "may be safely ignored" (p. 212).
Despite the obvious arbitrariness of Selinker's 5% estimate, it has been
repeated in the SLA literature numerous times over the years. As a result,
many students of SLA, including researchers, have treated Selinker's guess
more or less as an established fact. On the other hand, there are researchers
who have questioned the 5% figure, for different reasons and from different angles. Although some have indeed suggested that a much larger number
(say, 10–15%) of adult learners reach nativelike levels in the L2 (see, e.g.,
Birdsong, 1999, 2005a; Seliger, Krashen, & Ladefoged, 1975), others claim
that entirely nativelike adult L2 learners do not exist at all. This latter position is taken by, for example, Bley-Vroman (1989), who holds that adult
L2 acquisition comes about through general, cognitive learning strategies, as
opposed to the linguistically domain-specific principles that govern children's
acquisition of a first language (L1). To learn a language fully on the basis of
general cognitive learning strategies alone is, according to Bley-Vroman, not
only difficult but impossible, and, therefore, virtually no normal adult learner
achieves perfect success, if what one means thereby is development of native-speaker competence (p. 44). However, if absolute success does occur in a
few, rare adult L2 learners, this could, according to Bley-Vroman, be given
the same pathological status as the exceptional phenomenon of failure in L1
acquisition.
Bley-Vroman's work (1989) can be said to be representative of the Universal Grammar (UG) paradigm and of those researchers who argue that adult L2
learners no longer have access (or only partial access) to the innate universal
principles and constraints that are responsible for language development (e.g.,
Epstein, Flynn, & Martohardjono, 1996; Eubank & Gregg, 1999; Schachter,
1989). These researchers all take the theoretical position that adult L2 speakers' competence is different from that of L1 speakers; adult language learning
simply does not lead to absolute nativelikeness. Gregg (1996) formulated this
idea quite categorically when claiming that "truly native-like competence in an
L2 is never attained" (p. 52). Similarly, there are researchers outside the UG
paradigm who, on both theoretical and empirical grounds, suggested that the
number of absolute nativelike adult learners should be zero; for example, Long
and Robinson (1998) assumed that the maximal level of L2 attainment should
be what is frequently labeled near-native rather than nativelike, which is
an assumption that we ourselves have made previously.1
The existence or nonexistence of late, nativelike L2 learners has generally
been discussed in relation to the critical period hypothesis (CPH; Lenneberg,
1967). If such individuals do exist, many researchers claim that they would
constitute the evidence necessary to reject the CPH or any other hypotheses
proposing biological/maturational constraints on language learning. In fact,
Long (1990) argued that one single post-critical-period L2 learner with an
underlying competence indistinguishable from that of native speakers would
suffice to reject the CPH. The work presented in this article can be positioned
among those studies that have focused exclusively on the apparent counterexamples to the critical period, in order to test the hypothesis that language
acquisition is maturationally constrained.
Previous Research on the Incidence of Nativelikeness


Since the late 1960s, a large number of studies have compared groups of
early and late learners' ultimate attainment of an L2 (e.g., Asher & García,
1969; Johnson & Newport, 1989; Munro & Mann, 2005; Patkowski, 1980).
In general, such studies have found a strong negative correlation between
L2 learners' age of onset (AO) of acquisition and some measure of their L2
proficiency; whenever nativelike behavior has been observed, this has been
associated exclusively with younger learners. Of course, the prospect of finding
highly advanced, potentially nativelike adult learners is rather small when using
samples of randomly selected individuals with varying degrees of ultimate
attainment. For example, the study by Johnson and Newport of 46 Chinese
and Korean learners of L2 English demonstrated not only a strong negative
correlation (r = −.77) between AO of L2 acquisition and scores on an English
276-item grammaticality judgment test (GJT) but also that no participants with
AO beyond 7 years scored within the native-speaker range. Similarly, in their
partial replication of the Johnson and Newport study, Bialystok and Miller
(1999) found participants in two learner groups (L1 Spanish and L1 Chinese)
who performed like English native speakers on the GJT up to AO 8 years,
whereas no participants with AO beyond this age were reported to score within
the range of native speakers.
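
The two measures used in studies of this kind, the correlation between AO and test score and the check of whether a learner falls within the native-speaker range, can be illustrated with a minimal sketch. All data below are hypothetical and serve only to show the logic; the function names are illustrative and are not taken from any of the studies cited, and the range criterion (minimum to maximum of the native controls) is an assumption about how "within the native-speaker range" is operationalized.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def within_native_range(score, native_scores):
    """Assumed criterion: the score lies between the lowest and highest native score."""
    return min(native_scores) <= score <= max(native_scores)


# Hypothetical data, NOT taken from any cited study.
ages_of_onset = [3, 5, 8, 12, 17, 25]
gjt_scores = [270, 268, 262, 240, 225, 210]   # out of 276 items
native_scores = [265, 269, 272, 276]          # hypothetical native controls

r = pearson_r(ages_of_onset, gjt_scores)
nativelike_aos = [
    ao for ao, score in zip(ages_of_onset, gjt_scores)
    if within_native_range(score, native_scores)
]
print(f"r = {r:.2f}; AOs within native range: {nativelike_aos}")
```

With a criterion of this kind, a single low-scoring native control widens the range considerably, which is one reason why the composition of the control group matters for any claim about nativelikeness.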
However, in various other replications of the Johnson and Newport study
(1989; henceforth J&N89), late learners have indeed been found to perform
within the native-speaker range. What these replications have in common is
that the selection of participants departs from the original study in some crucial
ways, the two most important adjustments being, first, the extension of the
minimum length of residence in the host country, from J&N89's 5 years to
at least 10 years, and, second, the choice of participants with L1s other than
Chinese and Korean. In a partial replication using two different groups of
Dutch learners of English, all of whom had begun their L2 learning after age
12, Van Wuijtswinkel (1994) reported 8 of 25 learners in one learner group
and 7 of the 8 learners in another group with performance scores within the
range of native-speaker performance. In their study of 200 Korean learners of
English, Flege, Yeni-Komshian, and Liu (1999) found six participants with AO
12 who performed like natives on a subset of J&N89 stimulus sentences
(although none with an AO beyond 16), but in pronunciation tests, they found
no L2 participants with AO above 9 who spoke English without a detectable
foreign accent. Furthermore, Birdsong and Molis (2001) found in their J&N89
replication with Spanish learners of English that more than 20% of the late
learners (defined as those with AO 17 years) performed within the J&N89
native-speaker range on the GJT; in fact, a majority of the participants with AO
12 years performed within this range. Finally, in a replication with a subset
of the J&N89 sentences but with Hungarian learners of English and no native
controls, DeKeyser (2000) identified 10 out of 42 late learners who scored
within the range of child learners. However, in the absence of native control
speakers, it is difficult to evaluate the incidence of nativelike performance
among DeKeyser's learners, although a qualified guess is that native speakers
would score at or close to ceiling, as has been the case in the original J&N89
study and in the various other replications.
On the basis of these and similar studies, one might be tempted to conclude that nativelike attainment of an L2 is indeed possible, even common,
among individuals who started their acquisition after childhood. However, as
we have argued earlier (see Hyltenstam & Abrahamsson, 2003b; see also 2000,
2001), postpuberty (including adult) learners may well attain the same linguistic
knowledge and exhibit the same linguistic behavior as native speakers in certain
(limited) areas of the target language without thereby being indistinguishable
from mother-tongue speakers in all relevant respects. Our claim is that much
of the research that is frequently taken as evidence for the existence of late, nativelike L2 learners suffers from Type II errors, either because it has been based
on language tests that are too easy and involve quite simple structures (e.g., the
J&N89 GJT) or because language production data have not been analyzed in
sufficient detail. Both of these shortcomings tend to result in ceiling effects and
unwarranted claims for nativelikeness (Hyltenstam & Abrahamsson, 2003b,
p. 570; Long, 2005).
Therefore, what appears to be more compelling evidence for adult-learner
nativelikeness can be found in studies that have focused exclusively on late,
high-proficiency L2 speakers who have been preselected, or screened, for potentially nativelike verbal behavior. Characteristic of these studies is that they have
employed quite sophisticated techniques for linguistic scrutiny, either through
(a) great stringency and detail of the analyses, (b) demanding tests and tasks
(e.g., through the choice of unusual target-language structures that are known
to be difficult for learners), and/or (c) the use of multiple-task designs covering various linguistic domains rather than one or a few isolated structures,
phenomena, or domains. These methodological features will be illustrated next
through a review of a sample of studies.
In a pronunciation study by Moyer (1999), in which 24 highly proficient
and highly motivated American learners of German participated, four speech
elicitation techniques were used, representing four different speech modes:
word-list reading, sentence reading, paragraph reading, and free speech production. Results revealed that the word-list task produced the highest incidence of nativelikeness, as judged by a panel of four native German listeners;
this was followed by sentence reading, paragraph reading, and free speech
production, on which most learners failed to pass for native speakers. Only
one individual among the original 24 advanced learners passed for a native
speaker in all four speech modes. The Moyer study highlights an important problem with those pronunciation studies in which conclusions about
the critical period have been based solely on late L2 learners' accent-free
reading of rehearsed words, sentences, or short passages (e.g., Bongaerts,
Mennen, & van der Slik, 2000; Bongaerts, Planken, & Schils, 1995; Bongaerts, van Summeren, Planken, & Schils, 1997; Neufeld, 2001; for overviews,
see Bongaerts, 1999; Long, 2005). The typical result of such studies is that
quite a few participants pass for native speakers when their pronunciation is
judged by a panel of native listeners. However, as these results concern rehearsed reading (sometimes even imitation; see Neufeld, 1979) rather than
free speech production (as in the Moyer study), it is not unlikely that they
may reflect language-like behavior (Long, pp. 297ff.) rather than actual L2
proficiency.2
In a phonetic study of five intermediate and five advanced English-speaking
late L2 learners of Spanish, Colantoni and Steele (2006) investigated one
specific area of phonological acquisition: stop-liquid sequences. Despite this
obvious limit in scope, the study surpassed most other CPH studies in its
high degree of detail and scrutiny. Instead of using native-speaker judges, the
readings of 44 words from each participant were analyzed acoustically with
regard to three phonetic properties of stop-liquid clusters: stop voicing, rhotic
length and voicing, and epenthesis rate and vowel length. Results revealed that
only one of the advanced learners (AO 24 years) and none of the intermediate
learners exhibited truly nativelike behavior (as defined by the analyzed speech
of 10 native control speakers) on all three properties. In a similarly detailed
study, Birdsong (2007) reported on aspects of the pronunciation of 22 late
L1 English learners of L2 French, all of whom had resided in the Paris area
for 11 years on average. It was shown that two participants performed within
the range of 17 native speakers of French on three measures: vowel length,
voice onset time, and global pronunciation, as rated by three native judges.
Although the incidence of nativelikeness was more or less identical in these
two studies (9–10% of the samples), interestingly enough the conclusions drawn
by the authors diverge significantly: Whereas Colantoni and Steele concluded
that although possible, nativelike attainment of an L2 by adults is clearly
exceptional (p. 71), Birdsong described his results as impressive rates of
nativelike pronunciation (p. 112).3
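
The incidence figures compared here follow directly from the counts reported for the two studies: one nativelike learner among the 10 participants in Colantoni and Steele (2006) and two among the 22 in Birdsong (2007). A minimal sketch of the arithmetic, in which the function name is illustrative only:

```python
def incidence(nativelike, sample_size):
    """Percentage of a sample that performed within the native-speaker range."""
    return 100.0 * nativelike / sample_size


# Counts as reported in the text above.
colantoni_steele = incidence(1, 10)   # 1 of 10 learners
birdsong = incidence(2, 22)           # 2 of 22 learners

print(f"Colantoni & Steele (2006): {colantoni_steele:.1f}%")
print(f"Birdsong (2007): {birdsong:.1f}%")
```

The two rates differ by less than one percentage point, which makes the divergence in the authors' interpretations all the more striking.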
In the area of grammar, Coppieters (1987) found that among 21 apparently
nativelike and highly educated adult learners of French as a foreign language,
none performed within the range of native controls on a syntactic/semantic
judgment task, covering a variety of (UG and non-UG) morphosyntactic constructions. In addition, from analyses of recorded interviews, it was observed
that many of them produced errors in structures that were actually mastered in
the judgment task. However, in a replication of the Coppieters (1987) study,
although using other criteria for participant selection, Birdsong (1992) reported
that no fewer than 15 of 20 late foreign language learners (AO 11–28 years) of
French, all of whom had also resided in France for a minimum of 3 years, performed within the native-speaker range. Similarly, in a UG-oriented study of
the accessibility of Subjacency and the Empty Category Principle (ECP), White
and Genesee (1996) found no difference in test performance between a group
of near-native speakers of L2 English (including 16 individuals with AO
12) and a group of native English participants, although significant differences
were reported between a nonnative group and the native group. Furthermore, no
age effects within the participant groups were observed. The authors concluded
that access to these UG principles is unaffected by age, but they remained noncommittal about other linguistic domains (White & Genesee, p. 262). However,
a problem with this study is that most of the participants were L1 speakers of
French, a language in which Subjacency and the ECP work largely as they do in
English. It is therefore not clear why one should expect near-native participants
to have any particular difficulties with sentences including these features.
In contrast, Montrul and Slabakova (2003) focused on an area known to be
very difficult for L2 learners of Spanish: the morphological and semantic properties of aspectual tenses. With the focus set on highly proficient learners
with English as an L1, they investigated three participant groups: 17 near-natives, 23 superior learners, and 24 advanced learners, all of whom had
begun their Spanish studies in high school (age 12 years).4 Two linguistic tasks were employed: one sentence-conjunction task and one truth-value
judgment task, both of which were reported to be very difficult, even for
native speakers. Yet the results showed that 19 out of the total of 64 L2 participants performed within the range of 20 native control speakers on both
tasks; 12 of these were found in the group of 17 near-natives. Therefore, these
researchers concluded that a nativelike command of the Spanish aspectual system does not become unattainable after a certain age, although, in line with
White and Genesee (1996), acknowledging the possibility of one or several
critical periods for other linguistic structures or domains (Montrul & Slabakova,
p. 384).
A recent study by van Boxtel, Bongaerts, and Coppen (2005; based on van
Boxtel, 2005) is another good example of research that focuses on advanced late
learners' ability to acquire structures or details of the target language that are
known to be extremely difficult for L2 learners. In this case, the target area was
dummy subject constructions in Dutch, for which no explicitly formulated
rules are available (p. 376). The L2 proficiency of 43 very advanced late
immigrant learners (AO 12 years) with either German, French, or Turkish
as the L1 was scrutinized with two tests: one elicited imitation task and one
sentence preference task. The learners' performances were compared to those of
44 native speakers of Dutch. The study produced eight learners (three German,
four French, and one Turkish) who scored within the native-speaker range on
both tasks and on all sentence types. It was concluded that implicit acquisition
(from L2 input alone) of unusual and difficult structures is indeed possible even
for late learners and that results of this kind are unsupportive of the CPH.
Finally, and most relevant for the present study, there are a few studies that
have tried to identify late L2 learners with nativelike competence and behavior across a wide range of tasks, thereby covering as many linguistic domains
as possible. This approach to global nativelikeness was first taken by Ioup,
Boustagui, El Tigi, and Moselle (1994) in their influential study of two exceptional adult learners of Arabic. The two learners, called Julie and Laura, both
had English as an L1 and were chosen for the study because native speakers of
Arabic did not usually notice their nonnative background. At the time of the
study, both learners were residents of Egypt. Julie had no knowledge of Arabic
before moving from Britain to Cairo at the age of 21 years, and she was married
to an Egyptian man, had two children, and spoke only Arabic with her family
and her husband's relatives. Her length of residence in Egypt was 26 years,
and she had learned Arabic exclusively through informal exposure. Laura, on
the other hand, was a native speaker of American English and had received
extensive formal exposure to Arabic at various universities. She was married
to an Egyptian man and had been living in Egypt for 10 years at the time of the
study. Thus, both of these L2 speakers could be thought of as being optimally
immersed into the L2 as well as into the Egyptian society and culture. The
two learners were subjected to a large test battery, consisting of six measures
that covered speech production (free speech judged by a native-speaker panel),
accent identification (two different tests), and grammatical proficiency (translation, grammaticality judgment, and interpretation of anaphora). Although
both Julie and Laura performed extremely well on all these tasks (in fact,
Julie performed better than many native speakers on the accent identification
tasks), both performed significantly below the native-speaker range on aspects
of grammatical intuition.
In a recent study of L2 mastery across-the-board, Marinova-Todd (2003)
investigated 30 highly advanced late learners' ultimate attainment of English.
The participants were selected on the basis of recommendations by native speakers who found them to be highly proficient in English. The learners had varying
L1 backgrounds and they had arrived in the United States between the ages of 16
and 31 years. Their ages of first exposure to English were 10–21 years and they
had a length of residence between 5 and 20 years. A battery consisting of nine
instruments was used, covering four linguistic domains: pronunciation (elicited
and spontaneous speech), lexicon (receptive vocabulary size and productive
lexical diversity), morphosyntax (grammaticality judgment, production, and
sentence comprehension), and language use/pragmatics (politeness strategies
and narrative ability). When compared to the performances of 30 native English
speakers, most L2 participants did not pass for native speakers on all nine measures. However, two individuals did so, and one additional learner performed
within the native-speaker range on all but one measure (vocabulary size). All
three learners arrived in the United States at age 18 years and had a length of
residence of 5–7 years in the country. Worth pointing out is that these three
late learners had, prior to their arrival, received formal English instruction in
their home countries for 5 years on average, which means that they were about
13 years old, not adults, when they actively began to learn English, a fact
that makes them less comparable to, for example, Julie and Laura in the Ioup
et al. (1994) study.
As has become clear from the above review of the research literature,
the reports on the incidence of nativelikeness in late L2 learners vary enormously, from quite high rates (e.g., Birdsong, 1992; Birdsong & Molis, 2001;
Montrul & Slabakova, 2003; Van Wuijtswinkel, 1994; White & Genesee,
1996), through more moderate rates (e.g., Birdsong, 2007; Bongaerts, 1999;
Colantoni & Steele, 2006; Flege et al., 1999 [for grammar]; Marinova-Todd,
2003; Moyer, 1999; van Boxtel et al., 2005), to zero occurrences (e.g., Bialystok & Miller, 1999; Coppieters, 1987; Flege et al. [for accent]; Ioup et al.,
1994; Johnson & Newport, 1989). In our own empirical studies of very advanced L2 speakers, in which we have tried to adopt stringent elicitation
methods and techniques of analysis, we have consistently failed to identify
nativelike late learners of L2 Swedish. So far, we have interpreted this as
support for our claim that nativelike L2 learners with an AO of acquisition beyond puberty are extremely difficult, or even impossible, to find (see
Abrahamsson & Hyltenstam, 2008; Abrahamsson, Stolten, & Hyltenstam, in
press; Hyltenstam, 1992; Hyltenstam & Abrahamsson, 2003a; see also Stolten,
2005, 2006). Here, as in a few other studies, the question of the actual ultimate attainment resulting from childhood learning has also been highlighted;
in fact, it now seems clear that differences do exist even between early learners'
ultimate attainment and native-speaker proficiency (cf. the results reported by
Bialystok & Miller, 1999; Butler, 2000; Flege, Munro, & MacKay, 1995; Flege
et al., 1999; Lee, Guion, & Harada, 2006; McDonald, 2000; Tsukada, Birdsong,
Bialystok, Mack, Sung, & Flege, 2005; for a discussion, see Hyltenstam &
Abrahamsson, 2003b). Obviously, if very short delays in the onset of acquisition can be shown to have effects on the ultimate level of L2 proficiency, this will
have important implications for the CPH or any other theory of maturational
constraints in SLA.
The Present Study
A central point of departure for the study to be described here is that as
long as there are no accounts of a single adult learner who, in all relevant
respects, can be shown to have attained proficiency in an L2 that is identical
to a native speaker's, there can be no claims for the existence of such learners
(Hyltenstam & Abrahamsson, 2003b; cf. also Long, 1990, 1993). Therefore,
rather than investigating the ultimate attainment of a representative sample of
the L2 learner population, the present study aimed at identifying individuals
who would potentially constitute the evidence necessary to reject the CPH. In
other words, the study positions itself among those previous studies that have
focused exclusively on learners who (at least) seem to have attained a nativelike
level of L2 proficiency.
The present study was conducted in two consecutive parts, referred to here
as Part I and Part II, respectively. Part I concerned native-listener perception
of nativelikeness of a large pool (n = 195) of very advanced L2 speakers of
Swedish with AOs of L2 acquisition between <1 and 47 years. In addition, the
native-listener judgments resulting from this part of the study also served as a
screening that formed the basis for participant selection for the second part.
Part II consisted of a detailed linguistic scrutiny of nativelikeness of a subset
(n = 41) of L2 speakers with AOs 1–19 years, all of whom had passed for native
speakers of Swedish with a majority of the native listeners in Part I.
As described in greater detail below, the study aimed at incorporating
three important methodological features that (with a few exceptions) have been
lacking in CPH-related research: (a) a specified understanding of the concept
of nativelikeness; (b) initial screening of participants; and (c) an in-depth
scrutiny of actual linguistic nativelikeness. The lack of these features in earlier
studies has contributed significantly to what we see as an overestimation of the
incidence of nativelikeness.
The first feature concerns the way in which the concept of nativelikeness
can be understood. In order to address this question in greater detail, we distinguish three different ways in which the concept of nativelikeness has been
interpreted:
Interpretation 1: to self-identify as a nativelike speaker of the target language
(e.g., Piller, 2002; Seliger, 1978; Seliger et al., 1975)
Interpretation 2: to be perceived as a nativelike speaker by native speakers
of the target language (e.g., Bongaerts, 1999; Moyer, 1999;
Neufeld, 2001)
Interpretation 3: to be a nativelike speaker of the target language (e.g.,
Birdsong, 1999; Bley-Vroman, 1989; Long, 1990)
It is, of course, difficult, or even impossible, for L2 users to judge for themselves
whether they pass for native speakers (i.e., Interpretation 1), a fact dealt with
only in passing in this article. Although we acknowledge the psychosocial reality of self-identification as a nativelike speaker (i.e., regardless of what native
speakers think or of what a linguistic analysis would reveal), we believe that
this interpretation of nativelikeness may be safely disregarded in the following
discussion, primarily because it clearly falls outside the scope of what is usually
meant by nativelikeness or native speaker in scholarly discourse. Whether
someone is perceived as a native speaker by (actual) native speakers (Interpretation 2), on the other hand, is a central aspect of nativelikeness. These speakers
are part of a language community in which they are joined by a reciprocal identification as members of this community based on linguistic characteristics, a
clearly sociolinguistic issue. The question of whether someone is linguistically
like a native speaker (Interpretation 3) constitutes, in our view, a basically psycholinguistic problem, although it may also be viewed from other perspectives
(i.e., social, pragmatic, etc.).
In our current research, we are primarily interested in Interpretation 3 of
nativelikeness, that is, the extent to which L2 learners exist who are nativelike in their language competence and behavior (see, e.g., Abrahamsson &
Hyltenstam, 2008; Abrahamsson et al., in press; Hyltenstam & Abrahamsson,
2003a). However, the study reported in this article covers all three interpretations of nativelikeness in the following manner. First, all participants included
in Part I of the study are highly advanced L2 learners, all of whom identify
themselves as potentially nativelike or near-native speakers of Swedish. Furthermore, Part I of the study investigates the number of these advanced learners
who are perceived as nativelike speakers by native listeners. Finally, Part II of
the study investigates how many of these perceived nativelike speakers, in fact,
behave linguistically like native speakers of Swedish on a broad selection of
language proficiency measures and tasks.
The second methodological feature that this study tried to capture is initial
screening of participants. Because the research agenda advanced by the study
concerns the incidence of actual, linguistic nativelikeness, the selected participants need to be highly advanced L2 speakers. As correctly pointed out by
Long (1993), "[t]here is no value in studying obviously non-native-like individuals intensively in order to declare them non-native-like" (p. 204); therefore,
screening participants for potential nativelikeness is a crucial procedure prior
to language testing. In previous research, such procedures have been adopted in
one of two ways. The first is informal screening, in which recruitment of participants comes about through impressionistic judgments by teachers or school
administrators (Bongaerts, 1999; Hyltenstam, 1992; Moyer, 1999; van Boxtel
et al., 2005) and/or the researchers themselves (e.g., Colantoni & Steele, 2006;
Hyltenstam & Abrahamsson, 2003a; Marinova-Todd, 2003; Neufeld, 1979),
most often as a result of word-of-mouth and "friends of friends" networking.
The other way to select participants is through more formal screening procedures, in which expert judges or larger panels of linguistically naïve native
listeners make judgments of recorded speech samples, usually from a large
pool of candidates identified through, for example, newspaper advertisements
or posters on university campuses (e.g., White & Genesee, 1996). As mentioned
earlier, the sessions with native listeners in Part I of the present study functioned
as an extensive and careful formal screening procedure for participant selection
for Part II.
The third methodological feature that this study tried to incorporate is the
in-depth scrutiny of actual, linguistic nativelikeness. Because we are obviously
dealing with very advanced, seemingly nativelike individuals, our primary
challenge is to avoid Type II errors, that is, claims of nativelikeness for L2
speakers whose linguistic knowledge and behavior ought to be described as
near-native rather than nativelike. One way of doing this is to avoid ceiling
effects. This calls for testing procedures in which linguistic measurement is
characterized by a sufficient degree of scrutiny: The tests and tasks should be
demanding, and the linguistic analyses should be made in great detail and with
extreme care. Another way to avoid Type II errors and false positives is to
include diverse measures of nativelikeness, representing different aspects and
levels of linguistic proficiency, rather than limit the scope of inquiry to one or a
few linguistic domains. One distinction that is sometimes made concerning the
relevance of the critical period and maturational constraints is the one between
phonology and grammar. Some researchers (e.g., Scovel, 1988) maintain that
because of the physiological and muscular basis of articulation, a critical period
can be expected only for the ultimate attainment of pronunciation but not
necessarily for higher order linguistic phenomena, such as morphology or
syntax. However, still others claim that only morphosyntactic features specified
by UG, as opposed to non-UG features, constitute the relevant domain of
research on nativelikeness and the critical period, and most of these researchers
would agree that core UG features, as opposed to peripheral ones, should be
the focus of attention (e.g., Eubank & Gregg, 1999). In contrast, there are
researchers who suggest that the focus should be on aspects of the L2 or of
L2 acquisition that are known to be difficult for learners; for example, Long
(1993) suggests that one focus should be on unusual structures.
As we see it, the scope of research on nativelikeness must not be limited to
any specific aspect of the L2. Rather, studies need to include measurements of
various kinds of L2 features, including all linguistic levels (phonology, grammar, lexis, etc.), skills, processing, automaticity, as well as both production and
perception. Research conducted by Sorace (see, e.g., 1993, 2003) offers evidence that a fruitful area of investigation should be the way in which L2 speakers
may diverge in their grammatical (and lexical) choices, without therefore exhibiting overt errors. Her studies show that near-native L2 speakers frequently
diverge from native speakers, although their performance/competence may well
be in accordance with formal target-language norms or UG constraints. Thus,
the subtle differences between native-speaker and near-native-speaker competence must be searched for also (or even especially) beyond pronunciation and
outside the UG domain. Furthermore, Birdsong (2006) suggested that where
nativelikeness is perhaps least likely to be observed is in certain domains of
language processing (p. 21), and in a similar vein, Liu (2006) contended that
despite recent advances in research on successful L2 users' end-state competence, much remains unknown about their end-state processing ability in the
L2 (p. 2).
In order to produce a representative across-the-board measurement of
nativelikeness, including aspects of language processing, the present study
employed 10 instruments for L2 scrutiny, covering phonetic production and
perception (voice onset time), perception of words and sentences in white noise
and babble noise, grammaticality judgments (written and auditory test modes
with latency times), grammatical, lexical, and semantic inferencing (a cloze
test), and formulaic language (tests of idiomatic expressions and proverbs) (see
below for further details, Part II of the study).
The questions that guided the study were the following: (a) Do late (i.e.,
adolescent and adult) L2 learners exist who are perceived as native speakers?
(Part I); (b) Are most early (i.e., child) L2 learners ultimately perceived as
native speakers? (Part I); (c) Do late (i.e., adolescent and adult) L2 learners
exist who are nativelike when scrutinized in detail? (Part II); and (d) Are most
early (i.e., child) L2 learners ultimately nativelike when scrutinized in detail?
(Part II).
Part I: Perceived Nativelikeness
Method
Participants
During the period from September 2002 to March 2004, we identified a total of
195 L2 speakers of Swedish (132 females and 63 males) who had begun their
acquisition at various ages and who identified themselves as advanced and
potentially nativelike L2 speakers. They were identified through three large
advertisements in daily newspapers⁵ (see Appendix A) and a poster campaign
at nearly all universities and colleges in the Stockholm area in which we
encouraged people to call us on the telephone if they believed that their non-Swedish background was usually not noticed by native speakers of Swedish.
To qualify as participants for the study, respondents had to meet six criteria
mentioned in the advertisement. They had to (a) have Spanish as their L1, (b)
speak Swedish fluently without a foreign accent or any obvious grammatical
deviations, (c) be 19 years of age or older, (d) have lived in Sweden for 10 years
or more, (e) have an educational level of no less than senior high school (i.e.,
minimally 12 years of schooling), and (f) have primarily been exposed to and
acquired the variety of Swedish spoken in the greater Stockholm area (for
an English translation of the advertisement, see Appendix A). Because the
candidates responding to the first advertisement were strongly biased toward
lower AOs of acquisition, the second advertisement addressed only those with
AOs above 7 years, and the third advertisement addressed only those with AOs
above 10 years. In all other respects, the advertisements were identical. The first
advertisement resulted in 135 respondents, the second advertisement resulted
in 50 respondents, and the third advertisement and the poster campaign together
resulted in 10 respondents; that is, there were 195 in total.
In the following analyses, we will distinguish between the learner categories
AO ≤ 11 years and AO ≥ 12 years because these may be thought of as
representing L2 learning before and after the closure of a critical period, which

Table 1 Background information (independent variables) on the 195 respondents; comparisons between respondents with age of onset (AO) ≤ 11 and ≥ 12 years (df = 193)

                        AO ≤ 11 (n = 107)     AO ≥ 12 (n = 88)      t-test (two tailed)
Independent variable    M         SD          M         SD          t         p
AGE (years)             28.6      7.2         41.5      9.1         11.1      <.0001
LOR (years)             23.1      7.5         21.2      7.3         1.82      .071, ns
L2 EXP (years)          22.4      7.4         20.9      7.3         1.41      >.1, ns
L1 USE (%)              27.4      17.5        30.9      18.1        1.34      >.1, ns
SEX (% f/m)             69/31                 61/39                 0.23      >.1, ns

SEX: chi-square test, χ²(1, 195) = 0.23, p > .1.

traditionally has been associated with the onset of puberty (approximately age
12 years; Lenneberg, 1967). Although puberty has been questioned as a valid
upper limit for a critical period for language acquisition (not least by ourselves;
see, e.g., Hyltenstam & Abrahamsson, 2003b), we still find a distinction based
on a theoretically established hypothesis to be more valuable than a distinction
based on theoretically arbitrary grounds, such as AO ≤ 15 and AO ≥ 16 (cf.,
e.g., Bialystok & Miller, 1999; Birdsong & Molis, 2001; Johnson & Newport,
1989; Patkowski, 1980). Furthermore, age 12 years is a reoccurring cutoff
point that has been used or explicitly explored in previous studies (see, e.g.,
Bongaerts, 1999; Cranshaw, 1997; Flege et al., 1999; McDonald, 2006; Montrul
& Slabakova, 2003; van Boxtel et al., 2005; Van Wuijtswinkel, 1994; White &
Genesee, 1996), which justifies further the present division into early (AO
≤ 11) and late (AO ≥ 12) learners.
Of the 195 participants, 107 began their acquisition of Swedish before age
12 years⁶ and 88 began to learn Swedish at the age of 12 years or later. For most
of the participants, AO of acquisition coincides with age at immigration. A
comparison of background variables of the two AO groups is shown in Table 1.
The mean chronological age (i.e., age at the time of the study) was 28–29 years
in the early-learner group and 41–42 years in the late-learner group, a difference
that is statistically significant. In all other respects, however, the two groups
are fully comparable: There are no significant differences concerning length of
residence (LOR) in Sweden, amount of L2 exposure (operationalized as number
of years in Sweden minus number of years spent outside the L2 environment
since the time of immigration), frequency of L1 use (operationalized as the
informants' self-reported daily use of Spanish, expressed in percentages), or
distribution of women versus men. That the groups differ in chronological age
is rather unproblematic because there are no theoretical reasons to believe that
a somewhat higher age as such would have an impact (in any direction) on
nativelikeness (cf. results in MacKay, Flege, & Imai, 2006; for a discussion of
the age-length-onset problem, that is, that age of onset logically is confounded
with length of residence as well as with chronological age, see Stevens, 2006).
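
The group comparisons in Table 1 can be approximated from the printed summary statistics alone. The following is a minimal sketch, assuming a pooled-variance (Student) two-sample t-test, which is consistent with the reported df = 193. The function name `pooled_t` is hypothetical, and because the inputs are rounded means and standard deviations, recomputed values may differ slightly from those in the table.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's two-sample t statistic from summary statistics (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# AGE row of Table 1: AO <= 11 (n = 107) versus AO >= 12 (n = 88)
t_age, df_age = pooled_t(28.6, 7.2, 107, 41.5, 9.1, 88)
print(f"AGE: t({df_age}) = {t_age:.2f}")
```

The sign of the statistic depends only on the order in which the two groups are entered; its magnitude is what should be compared with the tabled value.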
Interview and Speech Elicitation
Our first encounter with the potential participants was by telephone. In response
to the newspaper advertisements, the 195 candidates called a project assistant
and went through a 15-min interview, which, with the respondents' consent,
was recorded on a SONY TC-D5M cassette recorder. The interview generated
the background data given in Table 1 as well as information about knowledge of
languages other than Spanish and Swedish, any residency outside the Stockholm
area since the time of immigration, formal instruction in Swedish as an L2,
mother-tongue instruction, any known hearing impairment, and any history of
dyslexia.
At the end of each interview, samples of more or less spontaneous speech
were elicited, which would later serve as stimuli in the listening sessions with
native speakers of Swedish (see below). The participants were asked to talk
freely for 1 min on a certain subject that anyone living in Sweden can relate to,
namely Astrid Lindgren, the most famous Swedish author of children's stories
and books (e.g., Pippi Longstocking).
Speech samples were also elicited over the telephone from 20 native speakers of Swedish, 10 females and 10 males, who had a mean age of 28 years
(range: 23–40). Of these, 10 had grown up in the Stockholm area and 10 had
migrated to Stockholm from other parts of Sweden; these latter native participants had, however, lived in the Stockholm area for many years and exhibited
only very subtle dialectal features in their speech. (We will return later to the
reasons for including dialectal variation in the material.) The notion native
speaker of Swedish was operationalized in this study as someone who (a) has
spoken only Swedish at home during childhood, (b) has had Swedish as the only
language of instruction at school, and (c) has lived his or her whole life in a context in which Swedish has been the majority language. Pure monolingualism
was not a requirement.
Preparation of Speech Stimuli
The first 20–30 s of the 1-min speech samples were extracted and used as stimuli
in three separate listening sessions with native judges, one session after each
newspaper advertisement (see Procedure section). The recordings were only
minimally edited. In those cases in which the content of a speech sample
revealed the non-Swedish origin of the speaker, this particular information
was cut out (e.g., "Back home in Chile, we used to . . ." or "When we moved
to Sweden . . ."). Very long pauses were also edited out. However, the final
duration of the speech samples was always between 20 and 30 s, which is a
sample length that has been demonstrated to be sufficient in speech-judgment
tasks involving linguistically naïve listeners (e.g., Cunningham-Andersson &
Engstrand, 1989; see also Flege, 1984).
Native Judges
For each of the three listening sessions, 10 different native speakers of Stockholm Swedish were engaged as judges. These were recruited among students
at Stockholm University but were linguistically and phonetically naïve and had
no knowledge of Spanish. All of the 30 judges (15 females and 15 males)
had grown up in the Stockholm area and had a mean age of 25 years (range:
21–30).
The reason for engaging a new panel of native judges for each of the three
sessions was twofold. First, it was practically impossible to reassemble the
original panel 7 and 18 months after the first session. Second, because the
judges were informed about the actual purpose and design of the research after
the session, it was necessary to engage a new, objective panel for each session.
Procedure
The listening sessions were run within a few months after each advertisement at
a point when new candidates no longer called us on the telephone. The sessions
were led by a male native speaker of Stockholm Swedish.
The first listening session included speech samples from the 135 candidates
who responded to the first advertisement as well as speech samples from the
20 native speakers of Swedish. The session took 90 min and the judges were
paid SEK 150 immediately after the session. The second listening session
included speech samples from the 50 candidates who responded to the second
advertisement plus samples from 8 of the native speakers (4 from Stockholm, 4
with subtle regional features). This session took 45 min and these judges were
paid SEK 100. The third listening session included speech samples from the 10
candidates who responded to the third advertisement and the poster campaign
as well as samples from 8 native speakers (not the same individuals as in
session 2); furthermore, in order to make this third session more comparable
to the previous two sessions with regard to length and content, the speech
material was supplemented by 40 randomly selected L2 speech samples from
the second session. In addition, using 40 samples in two independent listening
sessions with different judges provided a good opportunity to obtain a measure
of interrater reliability (see below). The duration of the third session and the
payment was the same as for session 2.
Judges were told that the research project concerned people's ability to differentiate Stockholm pronunciation from regional dialects and foreign accents.
This distracting information was given in order to prevent the judges from
focusing solely on foreign accents and to make the task resemble an authentic, everyday speaker-judgment situation, in which many sources of phonetic
variation may come to people's minds. In addition to the influence of social,
pathological, and personality factors, very mild and occasional deviances in
advanced L2 learners' speech are frequently interpreted by native listeners as
a consequence of regional variation rather than a nonnative background (cf.
Markham, 1997).⁷ Limiting the task to discrimination between native and nonnative speech, with no opportunity given to reflect on alternative origins of
phonetic variation, would thus prompt judges to interpret any kind of deviance
as a sign of nonnativeness, which is why the dialect dimension was also included
as a possibility. (For a similar method for capturing any possible confusion between regional accent/dialect and minor foreign accent in very advanced L2
speakers, see Marinova-Todd, 2003, p. 62.) After the session, the judges were
given correct information about the purpose of the study as well as about the
actual speaker distribution.
Instructions were given both orally by the session leader and in writing
on the computer screen; thereafter, the judges were encouraged to ask for
clarifications if needed. The judges were not instructed to focus on any linguistic
feature in particular, such as pronunciation, morphosyntax, or lexical choice
but rather to aim for an overall impression of each sample in order to judge each
speaker's status as native/nonnative speaker of Swedish (for a similar approach,
see Montrul & Slabakova, 2003, pp. 367–368).
The listening sessions took place with each judge alone in a sound-treated
room. The task was designed and run with the computer software E-Prime.⁸
Speech samples were presented in different random orders for each judge
through KOSS KTX/PRO earphones. Since the telephone recordings varied to
some extent in sound quality, the judges were able to adjust the volume at any
time during the session. During each sample, the following three alternatives
were presented on the screen:
(A) This person's mother tongue is Swedish and he/she is native to the
Stockholm area
(B) This person's mother tongue is Swedish but he/she is not native to the
Stockholm area
(C) This person's mother tongue is not Swedish
Three keys on the computer keyboard were marked with A, B, and C and the
judges could make their choice at any time during each speech sample. That
is, after only a few seconds if they had decided upon a certain alternative, they
could interrupt the sample presentation by pressing the corresponding button.
After judging a speech sample, the judges were instructed to hit the space key to
generate a new sample or to take a short break; thus, the judges performed the
task at their own pace with no time limit. Each sample was, however, presented
only once; judges could not go back and reconsider a sample.⁹
Analysis
The distinction "from Stockholm"/"not from Stockholm" was not included in the
analysis; instead, alternatives A and B were both regarded as an indication that
a participant was perceived as a mother-tongue speaker of Swedish, with or
without subtle (perhaps even nonidentifiable) dialectal features. In other words,
alternative C alone represented the judgment "does not pass for native speaker."
The reason for using a method based on binary alternatives, rather than
scalar alternatives in the form of a 1–5 or 1–9 scale (widely used particularly
in foreign accent studies; see, e.g., Flege et al., 1999; Munro & Mann, 2005),
was that nativelikeness (unlike, e.g., foreign accent), by definition, is a
binary phenomenon similar to, for example, marriedness and deadness.
Thus, our intention was not to have each native listener rate how nativelike
the participants were but rather to investigate whether some L2 speakers are,
in fact, interpreted as native speakers of Swedish.
Nevertheless, in order to operationalize and quantify the collective perception of the 10 native listeners, their judgments were transformed for each
speaker into scores of perceived nativelikeness, which we will here refer to
as PN scores.¹⁰ Thus, a PN score corresponds to the number of judges who
chose alternative A or B. For example, a speaker's PN score of 8 means that
8 out of the 10 native judges believed that this speaker is a native speaker
of Swedish, again, with or without what the judges may have interpreted as
subtle dialectal features.
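
The scoring rule just described can be sketched in a few lines. This is an illustrative sketch only: the function name `pn_score` and the example panel are hypothetical and are not data from the study.

```python
def pn_score(judgments):
    """Count the judges who chose alternative A or B (mother tongue is Swedish)."""
    return sum(1 for choice in judgments if choice in ("A", "B"))

# Hypothetical choices of 10 judges for one speaker (illustrative, not study data)
example_panel = ["A", "A", "B", "A", "C", "B", "A", "A", "A", "B"]
score = pn_score(example_panel)
print(score)
```

Because alternatives A and B are collapsed, the Stockholm/non-Stockholm distinction has no effect on the score; only a choice of C lowers it.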
Interrater Reliability
In order to obtain a measure of interrater reliability, here in terms of interpanel
agreement, the judgments from sessions 2 and 3 (i.e., from two independent
listener panels) of the overlapping 40 L2 speakers were compared. The correlation between the two sets of judgments was exceedingly high, r = .97, df =
38, p < .001, which indicates that the choice between engaging one panel of
judges or several different panels is of less concern in studies of this kind (cf.
also Cunningham-Andersson & Engstrand, 1989).
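
Interpanel agreement of this kind is a Pearson correlation between the PN scores that two panels assign to the same speakers. The sketch below shows the computation on a small set of hypothetical scores (not study data); the function name `pearson_r` and both score lists are assumptions made for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical PN scores for five speakers judged by two panels (not study data)
panel_a = [10, 8, 5, 1, 0]
panel_b = [9, 8, 6, 2, 0]
r = pearson_r(panel_a, panel_b)
df = len(panel_a) - 2  # with the 40 overlapping speakers of the study, df = 38
print(f"r = {r:.2f}, df = {df}")
```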
Results
A plot of all individual PN scores for the 195 L2 participants and the 20 native-speaker participants is shown in Figure 1. The figures in the plot express,
for each AO, the number of participants whom the judges believed were native
speakers of Swedish. Let us begin by establishing that there was nearly absolute
agreement among the judges concerning the 20 native speakers. As many as 18
of the natives were judged by all 10 judges as being mother-tongue speakers
of Swedish, whereas only 2 of the judges chose alternative C for one native
speaker each.¹¹
However, as is evident from Figure 1, the judgments are more varied concerning the L2 speakers, and a negative correlation between AO and PN score
can easily be observed by eye. Table 2 presents a comparison of the mean PN
scores for the native speakers, the early L2 learners (AO ≤ 11 years), and the late
L2 learners (AO ≥ 12 years). The native speakers received a mean PN score of
9.9 [i.e., (18 × 10 + 2 × 9)/20 = 9.9], the early L2 learners received a score
of 7.9, and the late learners received a score of 2.5. As shown in Table 2,
all differences are statistically significant [one-way ANOVA: F(2, 215) =
111.61, p < .0001; comparisons of adjacent groups with Fisher's Protected
LSD post-hoc test¹²]. Age of onset of acquisition is the variable most strongly
associated with perceived nativelikeness, r = −.72, df = 193, p < .001, and
can therefore explain more than half of the variation: r² = .52. Each of the
other variables (see Table 1) explains only about 2–8% of the variation: r² =
.024–.076. In other words, AO appears to be the best predictor of perceived
nativelikeness.
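
As a rough plausibility check, the omnibus F statistic can be recomputed from the group sizes, means, and standard deviations reported in Table 2. The sketch below is an approximation from rounded summary values, not a reanalysis of the raw data; the function name `anova_from_summary` is hypothetical. It uses the textbook within-groups degrees of freedom (N − k = 212), whereas the text reports the statistic as F(2, 215).

```python
def anova_from_summary(groups):
    """One-way ANOVA from (n, mean, sd) triples; returns (F, df_between, df_within)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# (n, M, SD) as reported: native speakers, AO <= 11 learners, AO >= 12 learners
three_groups = [(20, 9.9, 0.3), (107, 7.9, 2.9), (88, 2.5, 3.0)]
f_value, df_b, df_w = anova_from_summary(three_groups)
print(f"F({df_b}, {df_w}) = {f_value:.2f}")

# Share of PN-score variance associated with AO, from the reported correlation
r_ao_pn = -0.72  # negative: higher AO goes with lower PN scores
print(f"r^2 = {r_ao_pn ** 2:.2f}")
```

The recomputed F lands close to the reported 111.61, and squaring the correlation gives the reported share of explained variation.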
In Figure 1, the 195 L2 participants have been divided into five smaller AO
groups: early childhood (AO ≤ 5 years, n = 53), late childhood (AO 6–11 years,
n = 54), adolescence (AO 12–17 years, n = 31), early adulthood (AO 18–23
years, n = 33), and later adulthood (AO ≥ 24 years, n = 24). These 6-year
intervals are motivated partially by general phases in language development,
and on closer examination these divisions are, in fact, reflected in the general
pattern in Figure 1. What the two lower AO groups (i.e., early childhood and
late childhood) have in common is that a majority of the participants are
perceived as mother-tongue speakers of Swedish by most of the judges; as with

Figure 1 Scatter plot of PN scores versus AO for all 195 participants and the 20 native controls (AO 0 years).

Table 2 Group comparisons of mean PN scores for the native speakers (NS), the AO ≤ 11 learners, and the AO ≥ 12 learners with Fisher's protected LSD post hoc test based on ANOVA: F(2, 215) = 111.61, p < .0001

Participant group 1    n      M      SD     Participant group 2    n      M      SD     Fisher's LSD
NS                     20     9.9    0.3    AO ≤ 11                107    7.9    2.9    p = .005
AO ≤ 11                107    7.9    2.9    AO ≥ 12                88     2.5    3.0    p < .001

the native speakers, most L2 speakers in these two groups have received a PN
score of 9 or 10. However, these groups differ concerning the distribution of
lower PN scores; for example, there is no participant with PN score 0 in the
early childhood group, and only a few have received a PN score less than 6.
On the other hand, no participants in the two highest AO groups (i.e., early
adulthood and later adulthood) were judged as mother-tongue speakers of
Swedish to the same degree as the 20 native speakers (i.e., PN score 9–10);
rather, in these two groups, the ratings have been concentrated around PN
scores 0 and 1. At the same time, there is a larger number of participants in
the early adulthood group that received a PN score higher than 1 than is the
case with the later adulthood group, in which all participants but one received
a PN score of 1 or 0. A clearly higher degree of variation is found in the middle
group (i.e., adolescence), with no concentration of PN scores at either end of
the scale. At these AOs (12–17 years), there are some (five participants) who
are perceived as mother-tongue speakers of Swedish by all or all but one of
the judges, and some (six participants) who are perceived as mother-tongue
speakers by one or none of the judges; the remaining 20 participants are fairly
equally distributed along the scale (with one to five participants at each PN
score).
Figure 2 provides a clearer illustration of the relation between the five
smaller AO groups and the relation between these learner groups and the
native-speaker group. A one-way ANOVA test reveals that there are significant
differences between the groups, F(5, 215) = 67.40, p < .0001. As shown in
Table 3, Fisher's protected LSD post hoc test reveals that the main differences
can be found between the native group and all other groups (including the
earliest learner group¹³) and between the adolescence group and all other
groups. However, neither the difference between the two childhood groups nor
the one between the two adulthood groups reached significance, which indicates
that the major changes in eventual perceived nativelikeness of L2 learners can
be associated with adolescence.¹⁴
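
The five-group means in Table 3 and the two-group means in Table 2 describe the same 195 participants, so the former should pool to the latter. A minimal consistency check, assuming simple sample-size weighting (the function name `pooled_mean` is hypothetical, and the inputs are the rounded values as printed):

```python
def pooled_mean(subgroups):
    """Sample-size-weighted mean of (n, mean) pairs."""
    total_n = sum(n for n, _ in subgroups)
    return sum(n * m for n, m in subgroups) / total_n

# (n, M) for the five AO groups as reported in Table 3
early = [(53, 8.3), (54, 7.6)]            # early and late childhood
late = [(31, 5.1), (33, 1.6), (24, 0.4)]  # adolescence, early and later adulthood

print(f"AO <= 11: {pooled_mean(early):.2f}")
print(f"AO >= 12: {pooled_mean(late):.2f}")
```

Both pooled values agree, within rounding, with the means of 7.9 and 2.5 reported for the early and late learners.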
[Figure 2 graphic. Vertical axis: Score of Perceived Nativelikeness (0–10). Horizontal axis: Age of onset group: Native speakers (n = 20); Early childhood, AO <1–5 (n = 53); Late childhood, AO 6–11 (n = 54); Adolescence, AO 12–17 (n = 31); Early adulthood, AO 18–23 (n = 33); Later adulthood, AO 24–47 (n = 24).]

Figure 2 Average PN scores for all 195 participants, divided into five AO categories,
and the 20 native controls.

As we have seen in Figure 1, most early learners (AO ≤ 11 years) are
perceived as mother-tongue speakers of Swedish, whereas a majority of the
late learners (AO ≥ 12 years) are not. In fact, most of the adult learners (i.e.,
AO ≥ 18 years) are perceived as native speakers by either one or none of
the judges, and this is particularly true of those with AO ≥ 24 years. This
pattern is summarized in Table 4. We can see from the right column of Table
4 that when the whole group of L2 speakers is taken into account (i.e., all of
the 195 participants), approximately one third of the participants had received
a PN score of 9–10 (i.e., they were perceived as native speakers by 9 or 10
of the native listeners, a level that corresponds to the judgment of the native
speakers), one third received a PN score of 0–1 (i.e., they were perceived as
native speakers by only 1 or none of the judges), whereas the remaining third
received judgments somewhere in between (i.e., PN score 2–8). If we then
Table 3 Group comparisons of mean PN scores for the native speakers and the five
adjacent learner groups (AO <1–5, 6–11, 12–17, 18–23, and 24–47 years) with Fisher's
protected LSD post hoc test based on ANOVA: F(5, 215) = 67.40, p < .0001

Participant group 1    n     M      SD     Participant group 2    n     M      SD     Fisher's LSD
Native speakers        20    9.9    0.3    Early childhood        53    8.3    2.4    p < .02
Early childhood        53    8.3    2.4    Late childhood         54    7.6    3.3    p = .136, ns
Late childhood         54    7.6    3.3    Adolescence            31    5.1    3.2    p < .001
Adolescence            31    5.1    3.2    Early adulthood        33    1.6    2.0    p < .001
Early adulthood        33    1.6    2.0    Later adulthood        24    0.4    0.7    p = .087, ns

Table 4 Number and percentage of the participants with the two highest (9–10) and the
two lowest (0–1) PN scores and of the participants with PN scores in between (2–8)

PN score    Native speakers (N = 20)    AO ≤ 11 (n = 107)    AO ≥ 12 (n = 88)    All L2 participants (N = 195)
9–10        20 (100%)                   66 (62%)             5 (6%)              71 (36%)
2–8         -                           35 (32%)             32 (36%)            67 (35%)
0–1         -                           6 (6%)               51 (58%)            57 (29%)

examine the distribution within the two AO groups, we see that approximately
one third in both groups are in fact perceived as native speakers by two to
eight judges. However, for the highest and lowest distributions of PN scores,
the pattern that emerges is entirely the opposite: 62% of the early learners pass
for native speakers of Swedish (PN score 9–10) whereas 6% are absolutely
not perceived as native speakers (PN score 0–1); in contrast, among the late
learners, 6% pass for native speakers with 9–10 of the judges whereas 58% are
perceived as native speakers by only 1 or none of the judges.
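
The three bands used in Table 4 amount to a simple classification of PN scores. The sketch below, with the hypothetical function name `pn_band`, shows that rule and checks that the counts reported for the early and late learners add up to the totals for all L2 participants.

```python
def pn_band(score):
    """Map a PN score (0-10) to the three bands used in Table 4."""
    if not 0 <= score <= 10:
        raise ValueError("PN score must lie between 0 and 10")
    if score >= 9:
        return "9-10"
    if score <= 1:
        return "0-1"
    return "2-8"

# Counts per band as reported in Table 4
early = {"9-10": 66, "2-8": 35, "0-1": 6}   # AO <= 11, n = 107
late = {"9-10": 5, "2-8": 32, "0-1": 51}    # AO >= 12, n = 88
all_l2 = {band: early[band] + late[band] for band in early}

print(all_l2)
print(sum(early.values()), sum(late.values()), sum(all_l2.values()))
```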
Summary
Perceived nativelikeness was investigated in a sample (n = 195) of the population of advanced early and late L2 learners who perceive themselves as
potentially near-native or even nativelike speakers of Swedish. Their ages of
onset were <1–47 years. Among the native Swedish control participants, 18
out of 20 (or 90%) were perceived as native speakers by all 10 native judges,
whereas the remaining two were perceived as native speakers by 9 of the 10
judges. Of the 107 early L2 learners (AO ≤ 11 years), 62% were perceived
as native speakers by 9 or 10 judges, whereas only three were perceived as
nonnative speakers by all 10 judges. In contrast, of the 88 late learners (AO
12–47 years), 6% were perceived as native speakers by 9 or 10 judges (and this
result was limited to AOs 12–17 years), whereas 36 (or 41%) were perceived
as nonnative speakers by all 10 native judges.
It is important to stress that there were relatively few participants (in total,
39 of 195) who were not perceived as native speakers by any of the judges. This
indicates that the majority of the participants in fact did pass for native speakers
of Swedish with at least one (and often several) of the judges. For example, as
can be seen in Figure 1, five participants in the early adulthood group actually
passed for native speakers by as many as 5 or 6 judges; similarly, in the later
adulthood group, there is one individual with AO 30 years who convinced 3
of the 10 native judges that he or she was a native speaker of Swedish. This
gives us a clear indication of the generally advanced proficiency level of most
of the participants, despite the fact that they were, in this study, classified as
nonnativelike when having received a PN score lower than 9. Our very strict
criterion of nativelikeness is based on how the 20 native speakers were judged,
which turned out to be at least 9 of the 10 judges identifying a participant
as a mother-tongue speaker of Swedish. If instead we had used an arbitrary
or more liberal nativelikeness criterion (say, that at least half, or why not
one, of the native judges chose the alternative "This person's mother tongue
is Swedish. . ."), we see from Figure 1 that a much greater number of the
participants would have passed for native speakers, even among those with AO
12–23 years.
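
The effect of the criterion level can be illustrated with a small sketch. The function names and the example scores are hypothetical and are not taken from the study; the sketch only assumes that a PN score is the number of judges (0–10) who chose the native-speaker alternative, and that the strict criterion requires at least 9 of them.

```python
def passes_for_native(pn_score: int, min_judges: int = 9) -> bool:
    """True if at least `min_judges` of the 10 judges perceived the speaker as native."""
    if not 0 <= pn_score <= 10:
        raise ValueError("PN score must lie between 0 and 10")
    return pn_score >= min_judges


def pn_band(pn_score: int) -> str:
    """Band a PN score in the way Table 4 groups them: 9-10, 2-8, or 0-1."""
    if not 0 <= pn_score <= 10:
        raise ValueError("PN score must lie between 0 and 10")
    if pn_score >= 9:
        return "9-10"
    if pn_score >= 2:
        return "2-8"
    return "0-1"


# Hypothetical PN scores for five speakers (illustration only, not study data).
example_scores = [10, 9, 6, 3, 0]
for min_judges in (9, 5, 1):
    n_pass = sum(passes_for_native(s, min_judges) for s in example_scores)
    print(f"criterion >= {min_judges} judges: {n_pass} of {len(example_scores)} pass")
```

Lowering `min_judges` can only add speakers to the "passing" set, which is why a more liberal criterion would admit more participants at every AO.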
Part II: Scrutinized Nativelikeness
Method
Selection of Participants
As mentioned earlier, the listening sessions in Part I of the study served as
a formal screening procedure for participant selection in Part II. Our original intention was to include only those L2 speakers who were judged to be
mother-tongue speakers by the listeners to the same extent as the 20 Swedish
participants; in other words, only those L2 learners whose casual, everyday
speech was indistinguishable from that of native speakers. However, because
only five participants with AO at or beyond 12 years and no participant with
AO beyond 17 years passed for native speakers, it was decided that the criterion
level for participant selection in Part II should be somewhat adjusted. A more
liberal definition of perceived nativelikeness was therefore adopted and it was
decided that potential candidates for inclusion in Part II should be those with
PN scores ≥ 6; that is, those L2 speakers who had passed for mother-tongue

speaker of Swedish with a majority of the 10 native judges. This criterion


resulted in no fewer than 104 potential participants (i.e., 55%) out of the original 195 candidates, although there was still a strong bias toward lower AOs:
Whereas 87 (81%) passed for native speakers out of the candidates with AO
<1–11 years, only 17 (19%) individuals did so among the candidates with AO
12–47 years.
From these 104 potential candidates, 41 individuals (32 females and 9
males) were selected who met most of the background criteria. Our original
intention was to be able to include 60 participants, evenly distributed across an
AO span between 1 and 20+ years (i.e., with three participants for each AO)
and carefully matched against each other for background factors such as age,
sex, and frequency of daily L1 use. However, such selection procedures were
possible only among the candidates with AO <1–11 years because nativelikeness (in terms of PN score) was clearly biased toward lower AOs, and several
gaps occur in the upper half of the AO continuum (12–20+ years), with no
participants selected with AO 12, 18, or 20+ years. Of the 17 late learners who
passed for native speakers according to the PN score ≥ 6 criterion in Part I of
the study, only the 10 individuals who met the most crucial background criteria
could be selected for participation in Part II. The reasons for excluding no fewer
than 7 of the 17 late-learner candidates were the following. One candidate (AO:
23; PN score: 6) proved to be an L2 speaker of Spanish with Basque as the L1.
By her account, she began to learn Spanish when she entered the Spanish school
system at the age of 5 years. Another candidate (AO: 17; PN score: 9), who was
initially selected, later declined to participate in the project. Two candidates
(AO: both 12; PN scores: 8 and 9) were excluded because they had lived in
non-Spanish-speaking countries (Romania and East Germany, respectively) for
7–8 years prior to moving to Sweden. One candidate (AO: 13; PN score: 10)
reported that although she grew up in a Spanish-speaking country, her parents
were actually native speakers of English. The two remaining candidates (AO:
both 14; PN scores: both 6) were excluded because the quota for their AO had
already been filled; the three candidates who were actually chosen as representatives of AO 14 years were those with the highest nativelikeness ratings (PN
scores: 9, 9, and 8).
The mean age for the 41 selected L2 speakers at the time of testing was
32 years (range: 20–50). Their mean LOR in Sweden was 25 years (range:
12–42) and their mean length of L2 exposure was 24 years (range: 12–42).
As shown in Table 5, all differences, except for chronological age, between the early and late learners are statistically nonsignificant, which suggests that the two groups are satisfactorily comparable as far as background
variables are concerned, even after screening and after participant
selection.

Table 5 Background information (independent variables) on the 41 selected participants; comparisons between participants with age of onset (AO) ≤ 11 and ≥ 13 years
(df = 39)

                        AO ≤ 11 (n = 31)    AO ≥ 13 (n = 10)    t-test (two tailed)
Independent variable    M       SD          M       SD          t       p
AGE (years)             30.9    6.6         37.2    6.9         2.60    <.02
LOR (years)             25.5    7.1         22.3    6.0         1.31    >.1, ns
L2 EXP (years)          25.2    6.9         22.2    5.9         1.22    >.1, ns
L1 USE (%)              23.5    13.4        31.5    12.9        1.65    >.1, ns
SEX (% f/m)             71/29               100/0               1.97    >.05, ns

As for country of origin and Spanish variety, there was a strong bias toward
Chilean Spanish (27 participants) because Chileans are, by far, the largest
group among Spanish L1 speakers in Sweden. The other countries of origin
represented among the participants were Peru (six participants), Colombia (two
participants), Spain (two participants), Argentina (one participant), Bolivia (one
participant), Mexico (one participant), and Uruguay (one participant).
A group of native-speaker participants was included, consisting of 15
mother-tongue speakers of Swedish. These were selected on the basis of the
same definition and operationalization of native speaker as in the recruitment
of native participants in Part I. This group was matched with the L2 speaker
group regarding the skewed sex distribution (11 females and 4 males), educational level (senior high school diploma at a minimum), variety of Swedish
(Stockholm), and age (M: 30 years; range: 23–46). None of the native speakers had any experience in the phonetic or linguistic sciences or any academic
training in Swedish or other Scandinavian languages.
Procedure
The whole testing procedure took place in a sound-treated room at Stockholm
University with each participant individually. The session lasted for approximately 4 hr and was divided into three subsessions with two 20-min breaks
with food and refreshments in between. Prior to language testing and speech
elicitation, participants went through a hearing test with an OSCILLA SM910
screening audiometer, and a decrease of no more than 10 dB for one frequency
on one ear was considered acceptable. After the session, participants received

a financial compensation of SEK 500. Testing and data collection were carried
out by the same male native speaker of Stockholm Swedish who had conducted
the listening sessions in Part I.
Instruments
As the overarching aim of this part of the study was to scrutinize the participants'
actual nativelikeness in a variety of linguistic phenomena and language abilities,
a large set of instruments was employed. These instruments were designed in
such a way that they would allow for differentiation between native-speaker
proficiency and near-native proficiency, on the one hand, and between different
degrees of near-native proficiency, on the other hand. The language tests and
tasks were deliberately made highly complex in order to cause a high degree
of difficulty and cognitive load even for native speakers. Measurements were
carried out with as much care and in as much detail as possible. The rationale
behind this design was the absolute need to avoid any possible ceiling effects,
which we believe have strongly influenced previous research and theorizing
(see Hyltenstam & Abrahamsson, 2000, 2001, 2003b, for discussions; see also
Montrul & Slabakova, 2003, p. 385).
Although the test battery contained some 20 different instruments for language testing and speech elicitation, the present article will report only on
the 10 measures that have been analyzed so far. However, the present set of
results covers speech production, speech perception, morphosyntax, and formulaic language; in other words, a fairly representative sample of the broad
spectrum of L2 knowledge and processing abilities. The 10 instruments and
methods of analysis are presented next.
Production and perception of voice onset time (VOT). Voiceless stops
are usually associated with longer VOT values and voiced stops with shorter
values, VOT being the time interval between the onset of the release burst of a
stop consonant and the onset of periodicity from vocal fold vibration. Spanish
and Swedish differ as to where on the VOT continuum the voiced/voiceless
categories separate: The Spanish category boundaries are located at lower
(usually negative) VOT values than is the case in languages like Swedish (and
English), for which boundaries are found at higher (positive) values (Lisker &
Abramson, 1964). The present study included two VOT-related measures based
on one production task and one test of categorical perception. In the production
task (Instrument 1), the participants read aloud the Swedish words par, tal, and
kal.15 Each participant read each word 10 times, and the readings were recorded.
Spectral analyses were then made of the initial voiceless stops /p/, /t/, and /k/
using the Soundswell package (Hitech Development) (for further details, see

Abrahamsson et al., in press; Stolten, 2005). The categorical perception test


(Instrument 2) was based on the minimal pairs par-bar, tal-dal, and kal-gal,16
which had been recorded in an anechoic chamber by a native female speaker of
Swedish. Using the Soundswell software, a 5-ms-step VOT continuum ranging
from −60 to +90 ms was created for all three minimal pairs, and the stimulus
items were presented through earphones in different random orders for all
participants. Each word was presented together with the carrier phrase Nu hör
du. . . "Now you will hear. . .", and the participants' task was to decide whether
they heard the voiceless or the voiced member of the word pair by pressing one
of two buttons. The test was designed and run in E-Prime and took about 5 min
to complete (for details, see Stolten, 2006).
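
As a rough illustration of the stimulus design, the continuum and a category boundary estimate can be sketched as follows. The sketch assumes that the continuum runs from −60 ms (prevoiced) to +90 ms in 5-ms steps and that a listener's boundary is taken as the 50% crossover of "voiceless" responses, found by linear interpolation. The crossover method, the function names, and the response proportions are illustrative assumptions, not the procedure documented for the study.

```python
def vot_continuum(start_ms=-60, stop_ms=90, step_ms=5):
    """All VOT values (ms) on the continuum, both end points included."""
    return list(range(start_ms, stop_ms + 1, step_ms))


def category_boundary(vot_ms, prop_voiceless):
    """VOT (ms) at which 'voiceless' responses first reach 50%.

    Linear interpolation between the two neighbouring steps;
    returns None if the responses never cross 50%.
    """
    points = list(zip(vot_ms, prop_voiceless))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < 0.5 <= y1:
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    return None


steps = vot_continuum()
print(len(steps), steps[0], steps[-1])

# Hypothetical response proportions for four neighbouring steps (not study data).
print(category_boundary([0, 5, 10, 15], [0.0, 0.2, 0.8, 1.0]))
```

Under these assumptions the continuum has 31 steps per minimal pair.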
Speech perception in noise. Nonnative listeners are generally less able to
take advantage of linguistic context to decode speech presented in noise, and
the (negative) effect of increasing noise is greater for nonnative than for native listeners (see Bradlow & Bent, 2002; Hyltenstam & Abrahamsson, 2003a;
McAllister, 1997; Spolsky, Sigurd, Sato, Walker, & Arterburn, 1968). The
present study included two different perception-in-noise tests: one of word perception in babble noise and one of sentence perception in white noise. In the
babble noise test (Instrument 3; for details, see McAllister & Brodda, 2002),
participants encountered (in earphones) 30 simple, highly frequent bisyllabic
stimulus words in increasing babble noise (i.e., noise consisting of multiple
voices). The words were randomly and automatically selected from 100 potential stimulus words and were presented together with the carrier phrase Nu
hör du. . . "Now you will hear. . .". The words and the carrier phrase had been
recorded by a female native speaker in an anechoic chamber. The participants'
task was to repeat each word, and the experimenter entered the response words
into the computer. An in-built metric (see McAllister & Brodda, 2002) calculated the phonological distance between the stimulus word and the response
word, and the noise level was automatically adjusted accordingly. This means
that whenever the participant responded with the correct word (e.g., solen for
solen "the sun") or with a phonologically similar word (e.g., stolen "the chair"
instead of solen "the sun"), the signal-to-noise ratio (SNR) increased by 0.4 dB,
but decreased to the same extent whenever the stimulus word and the response
word were different and phonologically distant from each other (e.g., katten
"the cat" instead of solen "the sun"). In this way, a perceptual threshold level
could be established for each participant. The test took about 5 min to complete.
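
The adaptive logic described above can be sketched roughly as follows. The test's in-built phonological distance metric is not specified here, so the sketch substitutes plain Levenshtein edit distance over letters as a stand-in, with a hypothetical similarity cut-off of one edit; the starting value of 0 dB is also an assumption. The direction of the 0.4-dB adjustment simply follows the description in the text.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a crude stand-in for the test's phonological metric."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def update_snr(snr_db, stimulus, response, step_db=0.4, max_distance=1):
    """Adjust the tracked value by one step after a single response.

    Correct or phonologically similar responses move the value up by
    `step_db`; distant responses move it down by the same amount.
    """
    similar = edit_distance(stimulus, response) <= max_distance
    return snr_db + step_db if similar else snr_db - step_db


snr = 0.0  # hypothetical starting value
for stimulus, response in [("solen", "solen"), ("solen", "stolen"), ("solen", "katten")]:
    snr = update_snr(snr, stimulus, response)
    print(f"{response!r} for {stimulus!r}: value is now {snr:+.1f} dB")
```

Because each step is small and symmetric, a run of 30 words lets the tracked value settle around the level at which the listener's responses switch between "similar" and "distant", which is the threshold idea the paragraph describes.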
In the white noise test (Instrument 4), the participants encountered (through
earphones) 28 sentences (recorded by a male native speaker in an anechoic


chamber), containing both predictable and unpredictable information, and that


together formed an informational text on a fictional subject (adapted from
Platzack, 1973). The participants' task was to repeat each sentence verbatim,
and their repetitions were recorded by the computer for later analysis. The first
seven sentences were presented without noise; thereafter, the level of white
noise successively increased after every seventh sentence, giving SNRs of 13
dB, 6 dB, and 2 dB, respectively. The test took about 5 min to complete. A
strict scoring procedure was then adopted, in which only exact repetitions were
scored as correct repetitions (for a similar approach, cf. Bradlow & Bent, p.
276). The seven noise-free sentences were excluded from the analysis.
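
The strict scoring rule can be sketched as below. The sketch assumes that responses have already been transcribed to text, and that case and spacing differences in the transcripts are ignored before comparison; that normalisation step is an assumption, as are the function name and the placeholder sentences.

```python
def score_white_noise(targets, responses, n_noise_free=7):
    """Count exact repetitions, skipping the initial noise-free sentences."""
    if len(targets) != len(responses):
        raise ValueError("targets and responses must have the same length")

    def normalise(sentence):
        # Assumption: ignore case and extra whitespace in the transcripts.
        return " ".join(sentence.lower().split())

    scored_pairs = zip(targets[n_noise_free:], responses[n_noise_free:])
    return sum(normalise(t) == normalise(r) for t, r in scored_pairs)


# Placeholder items standing in for the 28 test sentences.
targets = [f"mening nummer {i}" for i in range(1, 29)]
responses = list(targets)
responses[10] = "mening numer 11"  # one inexact repetition in a noise condition
print(score_white_noise(targets, responses))
```

With 28 sentences and the first seven excluded, the maximum score under this scheme is 21.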
Grammaticality judgment. In order to measure the participants' grammatical L2 intuition and morphosyntactic processing ability, a comprehensive and
demanding grammaticality judgment test was administered in two versions: one
auditory (Instrument 5) and one in writing (Instrument 6). The test consisted
of 80 rather long (mean length: 17 words) and complex sentences based on
four morphosyntactic features of Swedish grammar: subject-verb inversion, reflexive possessive pronouns, placement of sentence adverbs in relative clauses,
and gender and number agreement (see Appendix B for sample items; for
further details, see Abrahamsson & Hyltenstam, 2008). Half of the sentences
were grammatically correct and half contained one grammatical error. The
sentences (which, in the auditory version, had been recorded in an anechoic
chamber by a female native speaker) were given in different random orders for
all participants. They were presented through earphones in the auditory version
and on the computer screen in the written version. By pressing one of two buttons at any point during or after a sentence, the participants indicated whether
they perceived it as grammatically correct or incorrect. Along with YES/NO
responses, the auditory GJT also registered reaction times (Instrument 7). Both
versions of the test were designed and run in E-Prime and each took 15–20 min
to complete.
Grammatical, lexical, and semantic inferencing. A more global measure
of the participants' L2 Swedish proficiency was obtained by the use of a cloze
test (Instrument 8). This technique, originally developed by Taylor (1953), has
been proven to mobilize the testee's total grammatical, lexical, contextual, and
pragmatic knowledge (McNamara, 2000, p. 15), which in normal language use
is used in perception and comprehension of both spoken and written language.
L2 speakers, even at very advanced levels, have been observed to have greater
difficulties than native speakers in making semantically and syntactically based
predictions about a text's continuation (see, e.g., Hyltenstam & Abrahamsson,

2003b). The cloze test employed here consisted of a 300-word text (adapted
from Platzack, 1973) where every seventh word had been removed. The test
was an untimed pen-and-paper task in which the participants were to fill in the
42 blanks with words that would fit into the context. The participants' performances were then blind scored by both authors independently. Responses in
the form of words other than the original ones (i.e., those hidden by the blanks)
were judged for lexical, morphosyntactic, and semantic appropriateness with
respect to their linguistic context; encyclopaedic errors were not considered
(e.g., if someone with poor knowledge of modern history filled in the word
"started" in the sentence "World War II ____ in 1945"). Disagreements in judgment
were settled through discussion and careful consideration.
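
The fixed-ratio deletion behind the cloze test can be sketched as follows. The sketch assumes that deletion starts at the seventh word and then removes every seventh word after it; the exact starting point in the original test is not stated, and the function name and placeholder words are hypothetical.

```python
def make_cloze(words, every=7, blank="____"):
    """Replace every `every`-th word with a blank; return gapped text and answers."""
    gapped = list(words)
    answers = []
    for index in range(every - 1, len(words), every):
        answers.append(words[index])
        gapped[index] = blank
    return gapped, answers


# Placeholder 300-word "text" (w1 ... w300) standing in for the adapted passage.
text = [f"w{i}" for i in range(1, 301)]
gapped, answers = make_cloze(text)
print(len(answers), answers[:3])
```

For a 300-word text this yields 42 blanks, which agrees with the number of blanks reported above.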
Formulaic language. Native speakers of a language frequently make use
of formulaic language, and both L1 and L2 learners rely heavily on prefabricated linguistic chunks in early phases of language development (Wray, 2002).
Paradoxically, however, for L2 learners of advanced proficiency, the idiomatic
use of formulaic language seems to be the biggest stumbling block to sounding nativelike (Wray, p. ix). The present study included two tests of formulaic
language: one of idioms (Instrument 9) and one of proverbs (Instrument 10).
Both tests were created and run in E-Prime, and they were identical in design
and procedure. Each test included 50 items, which were presented in writing
on the computer screen (one at a time and in the same order for all participants)
with a blank to be filled in (a missing word or chunk; e.g., Hon löpte verkligen
[linan] ut, roughly "She really went the whole hog," and Ju fler kockar [desto
sämre soppa] "Too many cooks spoil the broth," where the words in brackets
represent the blank). The participants responded orally by reading the whole idiom or proverb including the missing word or phrase. Responses were recorded
and later analyzed. Both tests were timed, and participants were given 10 sec
to complete each item. Each test took 7–8 min to complete.
Analysis
For the analysis, we used performance within the native-speaker (NS) range
as a way of defining nativelike behavior, and the lowest NS result on each of
the 10 measures was defined as the minimum criterion of nativelikeness of that
specific aspect of Swedish. This, of course, is a far more inclusive criterion
than those based on performance within, say, a 95% confidence interval of
native controls or within one or two standard deviations from the NS mean
(cf., e.g., Birdsong, 2007; Flege et al., 1999; Piske, MacKay, & Flege, 2001).
A nativelikeness criterion based on the NS absolute range should be viewed
as a stronger guarantee against Type I errors and false negatives (i.e., claims

of nonnativelikeness for L2 users who in fact behave like at least some native
speakers).
As mentioned earlier, the results generated from each of the 10 instruments
will not be presented in any exact detail here but rather in terms of how many,
and which, of the instruments the participants passed for native speakers on.
Therefore, by analogy with the use of PN scores (0–10) earlier, a
participant's score of "scrutinized nativelikeness" was indexed by the number
of instruments on which performance fell within the NS range and will be
referred to here as SN scores (0–10).
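
A minimal sketch of how such a count could be computed is given below. It covers only four of the ten instruments to stay short, uses the lowest NS results reported in Table 6 as limits, and treats missing data as within range, as is done for the "(+)" cells in the Results. The dictionary keys, the function names, and the example participant are hypothetical.

```python
from typing import Dict, Optional

# (lowest NS result from Table 6, True if higher values mean better performance)
NS_LIMITS = {
    "rt_auditory_gjt_ms": (8888, False),  # the longest NS reaction time
    "cloze": (30, True),
    "idioms": (33, True),
    "proverbs": (33, True),
}


def within_ns_range(value: Optional[float], limit: float, higher_is_better: bool) -> bool:
    """True if a result is at least as good as the lowest NS result."""
    if value is None:  # missing data are counted as within the NS range
        return True
    return value >= limit if higher_is_better else value <= limit


def sn_score(results: Dict[str, Optional[float]]) -> int:
    """Number of instruments on which the participant falls within the NS range."""
    return sum(
        within_ns_range(results.get(name), limit, higher_is_better)
        for name, (limit, higher_is_better) in NS_LIMITS.items()
    )


# Hypothetical participant (not from the study); the proverb result is missing.
participant = {"rt_auditory_gjt_ms": 7500, "cloze": 31, "idioms": 30, "proverbs": None}
print(sn_score(participant))
```

The direction flag is needed because, for reaction times, the lowest NS result is the longest time rather than the smallest number.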
Results
The NS ranges and means as well as maximal results for the 10 linguistic
test instruments are presented in Table 6. As can be seen, especially for those
instruments with fixed maximum scores, the tests and tasks were difficult even
for the native speakers. None of the native speakers reached the maximum result
on any of Instruments 4–6 and 8–10, which serves as a guarantee against ceiling
effects (for similar arguments, see Montrul & Slabakova, 2003, p. 385). For
obvious reasons, neither the measures of production and perception of VOT,
tolerance for babble noise, nor reaction times allow for a maximum result.
Table 6 The ten instruments with maximum result and native-speaker mean, highest,
and lowest results

Instruments                                   Max.    NS Mean    NS High    NS Low
1. VOT production (% of word dur.)   /p/      a       16.9       22.7       11.7
                                     /t/      a       15.7       21.8       10.7
                                     /k/      a       18.3       25.5       14.6
2. VOT perception (ms.)              /p-b/    a       7.2        27.8       13.0
                                     /t-d/    a       15.3       27.5       5.4
                                     /k-g/    a       24.6       33.3       17.9
3. Babble noise (SNR, dB)                     a       7.46       11.53      5.06
4. White noise (score)                        21      16         18         12
5. Auditory GJT (score)                       80      69         78         57
6. Written GJT (score)                        80      70         78         58
7. RT, auditory GJT (ms)                      a       7,729      7,160      8,888 b
8. Cloze test (score)                         42      36         41         30
9. Idioms (score)                             50      43         48         33
10. Proverbs (score)                          50      39         46         33

a Not applicable.
b The lowest NS result is represented by the longest reaction time.


Figure 3 Scatter plot of SN scores versus AO for the 41 selected participants; all 10
linguistic instruments.

Furthermore, the NS ranges are exceedingly wide for many of the measures,
which, of course, offers the L2 participants good prospects for performing
within those ranges.
The overall results of the L2 participants are presented in Figure 3, which is
the equivalent to Figure 1 in Part I. Although the participants were selected on
the basis of a nativelikeness criterion (PN score ≥ 6), we still find a significant
difference in mean SN scores between the early and late learners, t(39) = 2.80,
p < .01 (two tailed), as well as a negative (albeit weak) correlation between
the AO and the SN score, r = −.38, df = 39, p < .02. The weakness of this
correlation has a great deal to do with the two highest scoring participants in
the late-learner group, who, paradoxically, were those with the two highest AOs
(see below). By analogy with Figure 2 and Table 3 in Part I, Figure 4 and Table 7
provide a clearer illustration of the relation between the early and late learners
when divided into smaller AO groups. A one-way ANOVA test revealed that
there are significant differences between the groups, F(2, 41) = 3.892, p <
.03. However, a post hoc test (Fisher's protected LSD) showed that the basis

[Figure 4. y-axis: Score of Scrutinized Nativelikeness (0–10); x-axis: Age of onset group: Early childhood (AO 1–5), n = 15; Late childhood (AO 6–11), n = 16; Post-puberty (AO 13–19), n = 10.]

Figure 4 Average SN scores for all 41 selected subjects, divided into three AO
categories.
Table 7 Group comparisons of mean SN scores (AO 1–5, 6–11, and 13–19 years) with
Fisher's protected LSD post hoc test based on ANOVA: F(2, 41) = 3.892, p < .03

Participant group 1    n     M      SD     Participant group 2    n     M      SD     Fisher's LSD
AO 1–5                 15    6.1    2.4    AO 6–11                16    5.8    2.8    p = .718, ns
AO 1–5                 15    6.1    2.4    AO 13–19               10    3.5    2.0    p < .02
AO 6–11                16    5.8    2.8    AO 13–19               10    3.5    2.0    p < .03

for this result is the differences between the adolescent/adult group and the
two early-learner groups; the marginal difference in mean SN scores between
the two early learner groups was not significant.17 Thus, as was the case with
perceived nativelikeness and PN scores in Part I of the study, the major changes
in actual nativelikeness and SN scores can be associated with AOs beyond 12
years.
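
The significance of the AO–SN correlation reported above can be checked with the standard conversion of a Pearson coefficient into a t statistic, t = r·sqrt(df / (1 − r²)). The sketch below takes r as negative, in line with the negative correlation described in the text; the function name is hypothetical.

```python
import math


def t_from_r(r: float, df: int) -> float:
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(df / (1.0 - r * r))


t = t_from_r(-0.38, 39)
print(f"t(39) = {t:.2f}")  # prints t(39) = -2.57
```

An absolute t of about 2.57 with 39 degrees of freedom is in line with the reported two-tailed p < .02.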

Let us turn next to the individual level and the main question of this part of
the study, namely the incidence of actual nativelikeness. Of the 41 participants
selected, only two, possibly three,18 received an SN score of 10 (i.e., performed
within the range of the 15 native-speaker participants on all 10 measures of
Swedish proficiency). These learners' AOs were 3, 7, and 8 years, respectively.
Three participants received an SN score of 9, another two received an SN
score of 8, and these learners' AOs were between 1 and 7 years. Among the
10 late learners, the highest performing participant received an SN score of
7 and another received an SN score of 6. Interestingly, these two were those
with the highest AOs: 19 and 17 years, respectively. The remaining eight late
learners received SN scores of 1–5. Thus, there was no evidence of actual
nativelikeness among any of the late learners, and only a few childhood learners
exhibited nativelike results across the board.
Table 8 exhibits individual results for each participant and for each instrument.
The results are expressed in terms of +/− within the NS range, where + stands
for results at or above the lowest NS result and − stands for results below the
lowest NS result. In the few cases where data are missing (due to, e.g., technical
problems or uninterpretable responses), results are treated as if they were within
the NS range, but they are, for the sake of clarity, expressed with (+). As the
results show, the 10 instruments represent different degrees of difficulty; for
example, whereas 35 of the 41 participants had GJT reaction times (Instrument
7) within the NS range, only 6 performed within the NS range on the proverb
test (Instrument 10). However, there are no linguistic domains or tasks that
were never mastered, not even among the late learners; in other words, every
instrument is marked with at least one +. For example, participant 070 (AO 19)
exhibited nativelike knowledge and behavior on measures of morphosyntax,
formulaic language, and sentence repetition in white noise but not on details
of speech production and perception, whereas several other late learners had
nonnativelike results on the morphosyntactic tests but at the same time exhibited
nativelike behavior on at least one of the phonetic measures. Similarly divergent
patterns can be discerned among the early learners.
Nevertheless, some interesting differences can be observed concerning
which of the linguistic instruments cause the least and the most difficulty
among early and late learners, respectively. In Table 9, the 10 instruments have
been rank-ordered according to the percentage of learners within each group
who performed within the NS range. As can be seen, both groups exhibited most
nativelike behavior where reaction times on the auditory GJT were concerned,
whereas both groups showed least nativelike performance on the proverb test.
However, the main difference between the ranks concerns the relation between

Table 8 Overall results for the 41 participants on the 10 measures of L2 Swedish proficiency

Instruments: 1. VOT prod.; 2. VOT perc.; 3. Babble noise; 4. White noise; 5. GJT (aud.); 6. GJT (wri.); 7. RT (aud.); 8. Cloze test; 9. Idioms; 10. Proverbs
[The per-instrument +/− cells of this table are not recoverable here; the ID, AO, and SN columns are given below.]

ID     AO    SN
122    1     9
002    1     8
049    1     5
030    2     7
126    2     7
100    2     4
012    3     10
043    3     (4)
041    3     (5)
052    4     9
051    4     6
090    4     2
007    5     4
101    5     4
127    5     (8)
013    6     7
031    6     7
118    6     7
089    7     10
015    7     9
042    7     5
096    8     3
081    8     2
076    8     (10)
086    9     2
188    9     2
194    9     (7)
157    10    5
045    10    3
016    11    7
033    11    7
107    13    3
114    13    1
145    14    5
001    14    4
180    14    2
173    15    3
102    15    (3)
103    16    1
172    17    6
070    19    7

Note. +/− = results within/below NS range; (+) = data missing; ID = participant identification number; AO = age of onset; SN = score of
scrutinized nativelikeness (number of tests within NS range).

Table 9 The rate of nativelike attainment within different linguistic domains; the ten
instruments rank-ordered per AO group

Percent of participants    AO 1–11                               AO 13–19
within NS range            (n = 31)                              (n = 10)
94%                        RTs (GJT, auditory)
74%                        VOT production
71%                        VOT categorical perception
68%                        Word perception in babble noise
65%                        GJT (written)
60%                                                              RTs (GJT, auditory)
58%                        GJT (auditory)
58%                        Idioms
52%                        Cloze test (grammar/semantics)
50%                                                              GJT (written)
50%                                                              GJT (auditory)
50%                                                              Cloze test (gramm./sem.)
48%                        Sentence perception in white noise
40%                                                              VOT production
30%                                                              Word perception in babble noise
30%                                                              Sentence perception in white noise
20%                                                              Idioms
20%                                                              VOT categorical perception
16%                        Proverbs
10%                                                              Proverbs

phonetic and grammatical aspects of Swedish. Among the early learners, nativelike VOT production and perception, as well as word perception in babble noise,
were more common than nativelike grammatical intuition. The reverse pattern
is found for the late learners: The formal grammatical aspects, represented by
the two GJTs and the cloze test, are all ranked higher than the pure phonetic
aspects, which all appear in the lower half of this group's rank order. As can be
seen in Table 10, the differences between the early and late learners that were
statistically significant concern speech production and perception (i.e., Instruments 1–3) as well as reaction times (Instrument 7) and idiomatic expressions

Table 10 Differences in rate of nativelikeness between AO ≤ 11 learners and AO ≥
13 learners on each of the 10 measures; number (%) of participants; χ² test (n = 41,
df = 1)

Instrument             AO 1–11 (n = 31)    AO 13–19 (n = 10)    χ²       p
1. VOT production      23 (74%)            4 (40%)              3.931    <.05
2. VOT perception      22 (71%)            2 (20%)              8.092    <.01
3. Babble noise test   21 (68%)            3 (30%)              4.437    <.05
4. White noise test    15 (48%)            3 (30%)              1.038    >.1, ns
5. GJT (auditory)      18 (58%)            4 (40%)              0.992    >.1, ns
6. GJT (in writing)    19 (65%)            5 (50%)              0.397    >.1, ns
7. RT (aud. GJT)       29 (94%)            6 (60%)              6.812    <.01
8. Cloze test          16 (52%)            5 (50%)              0.008    >.1, ns
9. Idioms              18 (58%)            2 (20%)              4.385    <.05
10. Proverbs           5 (16%)             1 (10%)              0.227    >.1, ns

(Instrument 9). The differences for the more formal linguistic aspects, such as
morphosyntax (Instruments 5, 6, and 8) and proverbs (Instrument 10), did not
reach significance.
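
The per-instrument comparisons in Table 10 can be reproduced with the shortcut formula for Pearson's χ² on a 2 × 2 table, χ² = N(ad − bc)² / [(a + b)(c + d)(a + c)(b + d)], with no continuity correction. The sketch below is an illustration with a hypothetical function name; the counts are those given in Table 10 for Instrument 2.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Instrument 2 (VOT perception): 22 of 31 early learners and 2 of 10 late
# learners performed within the NS range.
chi2 = chi_square_2x2(22, 31 - 22, 2, 10 - 2)
print(round(chi2, 3))  # 8.092
```

The result agrees with the value listed for that row, which suggests that the tabled statistics were computed without a continuity correction.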
Summary
In Part II of the study, a subset of 31 childhood learners (AO 1–11 years)
and 10 adolescent and adult learners (AO 13–19 years) who had passed for
native speakers with at least 6 of the 10 judges in Part I were selected for a
broad and detailed scrutiny of actual (linguistic) nativelikeness. Of these, only
two, possibly three, performed within the range of the 15 native participants
on all 10 measures of Swedish proficiency. These learners' AOs were 3, 7,
and 8 years. Of the 10 late learners, 1 performed within the range of native
speakers on seven measures, 1 on six measures, and the remaining 8 performed
within the native-speaker range on one to five measures. In other words, only
a few of the early learners and none of the late learners exhibited actual,
linguistic nativelikeness across a broad range of tasks when their performance
was scrutinized in detail.
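The criterion summarized above (performance within the native-speaker range on every measure) can be illustrated with a short sketch. It is only a sketch under stated assumptions: the native-speaker range is taken to be the closed interval between the lowest and highest control score on a measure, which may differ from the operationalization actually used for individual instruments, and all scores and names below are invented for illustration.

```python
# All scores below are invented for illustration; they are not data from the study.
native_controls = {
    "cloze": [38, 41, 44, 40],
    "idioms": [25, 27, 30, 28],
    "proverbs": [18, 22, 24, 20],
}
learner_scores = {"cloze": 39, "idioms": 22, "proverbs": 19}


def measures_within_native_range(scores, controls):
    """Return the measures on which a learner falls inside the controls' range.

    Assumption: the native-speaker range is the closed interval from the
    lowest to the highest control score on that measure.
    """
    within = []
    for measure, score in scores.items():
        low, high = min(controls[measure]), max(controls[measure])
        if low <= score <= high:
            within.append(measure)
    return within


within = measures_within_native_range(learner_scores, native_controls)
nativelike_on_all = len(within) == len(learner_scores)
print(f"within native range on {len(within)} of {len(learner_scores)} measures: {within}")
print("nativelike on all measures:", nativelike_on_all)
```

A learner counts as nativelike overall only if the list returned covers every measure, which is why a single score outside the range is enough to fail the criterion.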
Discussion
The first important finding of the present study concerns the distribution and
incidence of nativelikeness across AOs of acquisition. At first glance, the results
seem to be compatible with those of earlier studies. First, we saw a strong
negative correlation (r = −.72) between perceived nativelikeness and the age
at which L2 acquisition began, and an overall group comparison revealed a
significant difference in perceived nativelikeness between early and late learners
(cf., e.g., Asher & García, 1969; DeKeyser, 2000; Hyltenstam & Abrahamsson,
2003a; Flege et al., 2005; Johnson & Newport, 1989; MacKay et al., 2006;
Munro & Mann, 2005; Oyama, 1976, 1978; Patkowski, 1980; Seliger et al.,
1975). However, there were no differences between early and late childhood
learners (AO <1–5 vs. 6–11) or between early and late adulthood learners
(AO 18–23 vs. 24–47); the only significant differences were those between
child learners and adolescent learners (AO 12–17) and between adolescent
learners and adult learners. The average perceived nativelikeness began to
decrease at or around AO 12 years, but this decrease leveled out at some
point after adolescence (cf. the sigmoid function of AO and degree of accent
suggested by Munro & Mann). Furthermore, and in accordance with many
previous studies (e.g., Johnson & Newport; Oyama 1976, 1978; Patkowski),
no clear connections could be observed between nativelikeness and LOR or
length of L2 exposure. In other words, and as has been shown in many studies
(e.g., Johnson & Newport; Munro & Mann), AO of L2 acquisition stands out
as the variable that best predicts ultimate perceived nativelikeness. The present
study actually shows this to be the case even when the speaker sample consists
exclusively of L2 learners who identify themselves as potentially nativelike or
near-native speakers.
Also in accordance with previous research (as well as with most layman
observations), a majority of the early learners were perceived as native speakers,
whereas most of the late learners were thought to have a native language other
than Swedish (cf., e.g., DeKeyser, 2000; Flege, 1999; Flege et al., 1999; Johnson
& Newport, 1989; Patkowski, 1980). On the other hand, 5 of the 88 late learners
actually passed for native speakers. This undeniably would appear to support
the claims of some researchers that nativelike adult learners do exist and,
consequently, indicate the nonexistence of a biologically determined critical
period for (second) language acquisition. However, such an interpretation of
our results would, for several reasons, be most unwarranted and even faulty.
First, it is important to note that these five individuals were found exclusively
among those with AOs 12–17 years and that no participant among the 57
candidates with an AO beyond 17 years passed for a mother-tongue speaker
of Swedish. This aspect of our results is, at least in part, congruent with the
results in a study by Flege et al. (1995), in which only a handful of the late
Italian learners of English were judged to speak the L2 without a foreign accent,
although none of these had an AO of acquisition above 16 years. In fact, in
the study by Flege et al. (1999), none of the Korean participants with an AO
beyond 10 years spoke L2 English without a foreign accent.
Second, the results from the 10 different measures of L2 knowledge and
processing (representing various aspects of speech production and perception,
morphosyntax, and formulaic language) showed that none of the late learners (AO 13–19) exhibited actual, overall linguistic nativelikeness when their
performance was scrutinized in detail. These results support the position held
by, for example, Bley-Vroman (1989), Long (1990), Gregg (1996), Long and
Robinson (1998), as well as ourselves (Hyltenstam & Abrahamsson, 2003b),
namely that nativelike L2 proficiency is, in principle, never attained by adult
learners. On the other hand, the results do not seem to support the claim that
certain linguistic features would be unlearnable after a certain age while the
acquisition of other aspects remains unaffected by the age of the learner. Neither
do they support the theoretical position (taken by, e.g., Scovel, 1988) that a
critical period is of relevance only for the phonetic/phonological domains of
language but not for morphosyntax, even if the late learners in this study were
actually shown to be less nativelike when it came to speech production and
perception as compared to the more formal, grammatical levels of Swedish.
However, what our data do show is that even if a nativelike mastery of any linguistic aspect of an L2 is indeed possible, even for late learners, the probability
of a late learner developing a nativelike command of all (or even a majority of)
relevant linguistic aspects (and across all linguistic domains, too) is close to
zero. We therefore believe that one can (and should) remain skeptical toward
any claims of absolute nativelikeness in adult learners and toward the rejections
of the CPH that tend to follow such claims, especially if based solely on perceived nativelikeness or on the apparent incidence of adult nativelike behavior
in certain linguistic domains.
Another central result of the present study concerns the incidence of nativelikeness in early learners. The results of Part I revealed that even if most
child learners were perceived as native speakers of Swedish by most judges,
this was far from the case with all of them. In fact, as many as 41 of the
107 candidates in this AO category were perceived as having a mother tongue
other than Swedish. In addition, even though 25 of the 31 early learners who
were selected for participation in Part II were perceived as native speakers by
9 or 10 native judges in Part I, only three of these learners performed within
the native-speaker range on all 10 measures of Swedish proficiency. In most
previous studies, early learners with less than nativelike L2 proficiency have
either not been identified at all or, in a few studies, only as single exceptions
(e.g., DeKeyser, 2000; see also Ioup, 1989; Obler, 1989). Only in recent years
have researchers been able to present results that suggest much higher rates of
nonnativelike early learners (e.g., Hyltenstam & Abrahamsson, 2003a; Butler,
2000; Ekberg, 2004; Flege, Frieda, & Nozawa, 1997; Flege et al., 1999; Lee
et al., 2006; MacKay et al., 2006; McDonald, 2000; Tsukada et al., 2005; for
a relatively early report in this direction, see Hyltenstam, 1992). For example,
Flege et al. (1997) observed in their L2 participants a noticeable foreign accent in English, even when their L2 acquisition had begun at ages 5–6 years
and despite their having used the language in an L2 environment for 34 years
on average. Furthermore, in a detailed acoustic analysis of the production of
unstressed English vowels by early and late Korean and Japanese bilinguals,
Lee, Guion, and Harada (2006) showed that except for the nativelike behavior concerning
fundamental frequency (F0), the Korean late and early learners were nonnativelike where duration, intensity, and vowel-quality reduction were concerned; the
Japanese late and early learners were nonnativelike on vowel-quality reduction
only.
These results suggest that one may consider it a myth that L2 learning that
begins in childhood easily, automatically, and inevitably results in nativelikeness (cf. also Harley & Wang, 1997, p. 44); in fact, the present study suggests
that nativelike L2 proficiency in individuals with low starting ages is considerably less common than has been assumed earlier. Irrespective of whether this
fact ought to be explained with a theory of nonmaturational factors or with
one of maturational constraints operating successively at early ages (for a detailed discussion, see Hyltenstam & Abrahamsson, 2003b, pp. 553ff, 569ff, and
572ff), one may safely conclude that an early AO of acquisition is a necessary
although not sufficient requirement for nativelike ultimate attainment in an L2
(cf. Hyltenstam, 1992; Hyltenstam & Abrahamsson, 2003a, 2003b).
That most early learners in previous studies have uniformly passed for
native speakers (e.g., by exhibiting test results comparable to those of native
controls) is most probably due to ceiling effects, which, in our view, is also
the reason why a handful of individual adult learners have been classified as
nativelike speakers in previous research. In other words, we believe that the
tests and measures adopted have not been sufficiently demanding and that the
analyses used have not been sophisticated enough to allow for discrimination
between native and near-native levels of language proficiency. As an example,
consider the type of sentences and structures used in Johnson and Newport's
(1989) grammaticality judgment test (also used in the various replications of
this study; see above). The 276 test sentences were quite short and structurally
simple, as illustrated by the following ungrammatical examples (pp. 73–77):

"The farmer bought two pig at the market," "The little boy is speak to a policeman,"
"Yesterday the hunter shoots a deer," "Susan is making some cookies for we," "Can
ride the little girl a bicycle?" and "Martha a question asked the policeman."19
Obviously, test items of this kind cannot be used in studies of near-native L2
speakers without resulting in full scores for all participants. Had we decided
to administer a Swedish version of the Johnson and Newport grammaticality
judgment test to our participants, we would have expected all of them to pass
for native speakers, irrespective of age of learning.20 We therefore suggest that
many, if not all, of the early learners in Johnson and Newport's study can be
considered false positives (cf. Long, 2005).
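The ceiling-effect argument can be made concrete with a toy example. This is a hedged illustration rather than a model of any actual test: the latent proficiency values, the two ceilings, and the function names are all hypothetical, and the only assumption is that a test cannot register a score higher than its maximum.

```python
from statistics import mean

# Hypothetical latent proficiency values (arbitrary units), NOT data from the study.
native_speakers = [96, 98, 99, 100, 97]
near_native_learners = [88, 91, 93, 90, 94]


def observed_scores(latent, test_ceiling):
    """Cap each latent value at the highest score the test can register."""
    return [min(value, test_ceiling) for value in latent]


def group_gap(test_ceiling):
    """Mean observed score of the natives minus that of the learners."""
    natives = observed_scores(native_speakers, test_ceiling)
    learners = observed_scores(near_native_learners, test_ceiling)
    return mean(natives) - mean(learners)


easy_test_gap = group_gap(test_ceiling=85)        # every participant hits the ceiling
demanding_test_gap = group_gap(test_ceiling=100)  # ceiling at or above all latent values

print("gap on the easy test:", easy_test_gap)
print("gap on the demanding test:", demanding_test_gap)
```

On the easy test both groups obtain identical full scores, so the latent difference is invisible; only the more demanding test leaves room for the two groups to separate.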
The question remains, however, why Selinker's unverified 5% estimation
concerning nativelike adult learners is still interpreted as a fact by laymen and
researchers, and why even higher rates of nativelikeness are still being suggested. We believe several factors have contributed to this. First, individual
perceptions and subjective judgments of nativelikeness may be highly influential. One must bear in mind that even though there were relatively few adolescent or adult learners who passed for native speakers in the present listening
sessions, there were relatively few individuals (only 39 of 195 in total) who
were not perceived as native speakers by any of the 10 native judges; that is,
156 participants were actually perceived as native speakers by one, sometimes
several, of the native listeners. This means that an L2 speaker who most native
speakers perceive as a nonnative speaker can still be perceived, interpreted, and
described as a nativelike speaker by individual reporters (laymen as well as
linguists) and thereby also be presented as evidence against a critical period
for language acquisition.
Second, varying definitions of nativelikeness certainly have had an influence on the different rates of nativelikeness reported. Studies that have used
either self-evaluation or native-speaker judgments typically report quite high
rates of nativelikeness (e.g., Bongaerts, 1999; Seliger et al., 1975). However,
as revealed by the different results of Part I versus Part II of the present study,
neither a definition based on self-identification nor one based on identification by others can reliably approximate the incidence of nativelikeness. Sorace
and Robertson (2001) stated that non-native grammars may exhibit certain
subtle features that distinguish them from native grammars (p. 266) and that
even learners who are capable of native-like performance often have knowledge representations that differ systematically from those of native speakers
(p. 266). In a similar vein, Bley-Vroman (1989) argued that no adult learners attain nativelike levels of L2 competence, even though some may have
a performance difficult to distinguish from that of native speakers (p. 44).
Earlier we have introduced the concept of non-perceivable non-nativeness
(Hyltenstam & Abrahamsson, 2003b), that is, nonnativelikeness that cannot
easily be detected in everyday conversation. Despite the somewhat lumbering
terminology, the concept of non-perceivable non-nativeness captures an important insight into the various rates of nativelikeness that have been suggested
for adult L2 learners (such as Selinker's 5%), namely that they denote the incidence of near-nativeness rather than nativelikeness (cf. also Hyltenstam &
Abrahamsson, 2003b; Long & Robinson, 1998). A majority of native judges in
Part I of the present study could not detect the actual nonnativelikeness of the
10 late learners revealed in Part II; nor could the actual nonnativelikeness of
the early learners be detected by the judges when the basis for their judgments
was spontaneous and casual speech.
Finally, as has already become clear from our discussion, we are convinced
that the somewhat liberal levels of scrutiny of previous studies (with a few
exceptions; e.g., Ioup et al., 1994) have resulted in ceiling effects and thereby
Type II errors and false positives. Obviously, the target of analysis must not
be the most basic rules, structures, or skills because much of the divergence
between the very advanced L2 speakers' and the native speakers' proficiency
consists of nonovert and/or low-frequency phenomena. Therefore, research on
nativelikeness and advanced learners' L2 ultimate attainment (regardless of AO
of acquisition) calls for a much higher sensitivity of the tests and instruments
than does, for example, research on the initial phases of interlanguage development. Only after the detailed scrutiny in Part II of the present study could
the L2 knowledge and behavior of our participants be distinguished from that
of native speakers.
To summarize our interpretation, the reason nativelike adult L2 learners
are still treated by many SLA researchers as quite ordinary occurrences
(Bialystok, 1997, p. 134) is a combination of several factors: on one hand,
personal, subjective, and unverified observations, and, on the other hand, empirical results based on either inappropriate definitions of nativelikeness or
insufficiently sophisticated techniques for linguistic scrutiny.
Some concern, and even some harsh criticism, has been raised against
the research agenda represented by the present study, and such criticism has
generally targeted the scope and degree of scrutiny in the analysis. For example,
Davies (2003) wrote that for the psycholinguist, no test is ever sufficient to
demonstrate conclusively that native speaker and nonnative speaker are discrete:
when nonnative speakers have been shown to perform as well as a native
speaker on a test, the cry goes up for yet another test (p. 213). In the same
spirit, Birdsong (2005b) warned us that the acid test of nativelikeness runs
the risk of being over-applied [such that] individuals who have demonstrated
nativelikeness in several areas of experimental performance could be subjected
to even further poking and prodding, until a betraying shibboleth is found
(p. 322). His view is that it would be a disservice to the scientific process to
insulate the CPH/L2A from falsifiability by adding task upon task and measure
upon measure to the nativelikeness criterion (p. 322).
If we were to agree with Birdsong that perhaps some kind of line ought to
be drawn somewhere, we would, at the same time, have to realize that we are
in no position to draw such a line yet; as we see it, research on nativelikeness
has only begun. For the moment, all we can say about that line is that it should
be drawn far beyond measures of nativelikeness based on self-identification,
far beyond measures of the phonetic ability to imitate native speakers or to
pass for native speaker on the basis of language-like behavior, far beyond
linguistic representation and UG constraints, and far beyond crude measures
of a limited set of linguistic phenomena. It must be remembered that we are
actually dealing with two of the most central and crucial questions in linguistics
and SLA, namely "Can L2 learners ever attain nativelike proficiency?" and "Is
there a critical period for (second) language acquisition?" Given that the null
hypothesis states that there are no differences between native speakers and
(adult) seemingly nativelike L2 speakers, it would be a greater disservice to
the scientific process if we, as researchers, chose not to do our best in trying to
reject it.
Summary and Conclusion
Perceived and actual (linguistic) nativelikeness was investigated in a sample of
the population of advanced early and late L2 learners who perceive themselves as
potentially near-native or even nativelike speakers of Swedish. This was done
in two steps: first through listening sessions with a large sample of L2 speakers
(n = 195; AO <1–47 years) and native judges, and then through broad and
detailed linguistic analyses of a subset of participants (n = 41; AO 1–19 years)
who had passed for native speakers in the listening sessions. Results revealed,
first, that a majority of the early learners but only a few of the late learners were
perceived as mother-tongue speakers of Swedish and, second, that only a few
of the early learners and none of the late learners exhibited actual, linguistic
nativelikeness across the board when their performance was scrutinized in
detail. The highest performing late learner exhibited results within the native-speaker range on 7 of the 10 measures of L2 proficiency, and her deviance from
native-speaker norms was limited to phonetic aspects of speech production and
perception.
Of course, we are in no position to offer any percentages for the occurrence
of nativelike ultimate attainment, neither for adult nor child learners and neither
for perceived nor actual nativelikeness. What we can offer instead is a number
of empirically founded reasons for treating existing estimates and rates of
nativelikeness with caution or even skepticism. Actually, we do not see revised
nativelikeness rates as something desirable, as estimates in the form of exact
percentages run the risk of being circulated as established facts. On the contrary,
we would like carefully to concur with Munro and Mann's (2005) comparatively
sober assertion that "[no] model of an age–accent connection should ever hope
to claim 'before age X, a person is guaranteed to develop a native accent and,
after age Y, a foreign accent is unavoidable'" (p. 337). Nevertheless, Selinker's
(1972) guess, that as much as 5% of the adult L2 learning population attain
absolute success, clearly is a gross overestimation of the actual situation;
our results point more in the direction that absolute nativelikeness in late
learners, in principle, does not occur. This, however, does not mean that we see
highly successful adult learners as theoretically uninteresting exceptions; to the
contrary, we believe that this population of L2 learners has a highly important,
not to say crucial, role to play in SLA theory building, not least concerning
the question of a critical period for language acquisition.
Revised version accepted 2 June 2008

Notes
1 In accordance with previous discussions (e.g., Hyltenstam & Abrahamsson, 2000,
2001, 2003b), our definition of near-nativeness throughout the present article
will be levels of nonnativeness that are nonperceivable in normal, everyday
language use.
2 As Neufeld (1979) demonstrated, an adult speaker may well learn to imitate
utterances in another language (even when being unaware of the meaning or
linguistic structure of the strings of speech imitated), and imitations may
sometimes be of such a quality that native listeners become convinced that they
originate from a native speaker.
3 It is important to note, however, that one of the two high-performing participants in
Birdsong's study had received 1 year of university-level phonetic instruction,
whereas the other worked as an actress in Parisian theater, a job that certainly
requires an unusually high degree of training in idiomatic pronunciation.
Furthermore, although Birdsong saw their academic exposure prior to, or during,
their residence in France as insignificant, one of them had, in fact, received 3 years
of high-school French, beginning at age 14 years and 1 year of college French at
age 21 years; the other participant had 3 years of college-level French in France
beginning at age 20 years.
4 The L2 participants were highly educated and were recruited from among
instructors, professors and advanced undergraduate students in Spanish language
programs at three major research universities in the United States (p. 366).
5 The newspapers were Metro, a free morning paper with 625,000–720,000 daily
readers (the Stockholm edition), distributed in the Stockholm public transportation
system (subway, buses, etc.) as well as in shopping malls and other public places,
and Aftonbladet, the leading tabloid newspaper in Sweden, with 325,000 daily
readers in the Stockholm area. The two Metro advertisements (September 16, 2002
and April 16, 2003) were both 125 × 182 mm in size, and the Aftonbladet
advertisement (March 15, 2004) was 250 × 100 mm.
6 Although the participants with the earliest AOs (<1–2) can be considered to have
simultaneously acquired two languages, we have chosen to disregard the distinction
simultaneous/successive bilingualism in this study. First, the issue affects only a
minority of our participants, but, second, and more importantly, it remains an open
empirical question what the effects are of even a minimal delay in L2 exposure.
7 In other words, we believe that the possibility of subtle dialectal variation may be
an important confounding factor contributing to the relatively high incidence of
nativelike adult learners reported in studies using Interpretation 2 of nativelikeness
above (i.e., nativelikeness as perceived by native speakers).
8 E-Prime (Psychology Software Tools, Inc.; Schneider, Eschman, & Zuccolotto,
2002a, 2002b) is one of the most user-friendly and widely used PC software packages
within psychological/behavioral experimental research; for a review, see Marinis
(2003, pp. 157–158).
9 Each choice caused four new alternatives to appear on the screen from which the
judges were to indicate their degree of certainty. This was done by pressing one of
four keys associated with the following alternatives: (a) "and I'm absolutely sure,"
(b) "and I'm fairly sure," (c) "but I'm fairly unsure," and (d) "but I'm very unsure."
This part of the procedure was originally included in order to make the task less
trivial, to stimulate careful judgments from the listeners, and to produce a more
finely grained quantification of the judgments. However, as being fairly sure/fairly
unsure/very unsure about an alternative does not reveal anything about the
specific cause of listener uncertainty (i.e., whether the uncertainty concerns the
choice between foreign accent and dialectal variation, between Stockholm
pronunciation and dialectal variation, or between Stockholm pronunciation and
foreign accent), this measure turned out to be useless for this part of the study and
was therefore excluded in the analysis. Nevertheless, we believe that this part of the
procedure successfully made the task less trivial and that it might have helped in
promoting careful and well-balanced decisions from the judges.
10 See Munro and Mann (2005), who used the notion degree of perceived accent
(DPA) when focusing on pronunciation only. However, the focus of the present
study is not on accent per se but on the broader concept of nativelikeness,
including various levels of linguistic performance.
11 In both of these cases the degree of certainty was the lowest, that is, ". . . but I'm
very unsure." The fact that all judges consistently perceived the native control
speakers as mother-tongue speakers of Swedish should be seen as additional
evidence for a high degree of reliability and validity of the listener judgments.
Obviously, "native speaker" and "native-speaker proficiency" seem to be concepts
that carry psychological relevance for native listeners, even if such concepts may
be difficult to define theoretically (see discussions in Birdsong, 2004; Cook, 1999;
Davies, 2003).
12 These differences remain significant with Tukey's HSD and Bonferroni post hoc
tests.
13 However, when checked with Tukey's HSD and Bonferroni post hoc tests (which
are somewhat more conservative procedures for multiple comparison), the
difference between the native-speaker group mean and the early childhood group
mean did not reach significance (p = .154 and .246, respectively).
14 This result, of course, potentially constitutes a serious challenge to the various
critiques of Lenneberg's (1967) original critical period hypothesis. First, it
challenges the abandonment of puberty (commonly referred to as age 12–13 years)
as a valid end point of the critical period (cf., e.g., Johnson & Newport, 1989).
Second, it challenges the rejections of a critical period that are based on
nonobserved discontinuity at a certain age or within a certain age span (cf., e.g.,
Bialystok & Hakuta, 1999; Flege et al., 1999). Similarly, and most importantly
from our point of view, the curve in Figure 2 poses a problem for the hypothesis
that a maturationally constrained age function could be depicted as a linear decline
from birth (as suggested in Hyltenstam & Abrahamsson, 2003b). An
age-nativelikeness function with a marked slope from age 12 throughout
adolescence, but with only minor slopes (if not plateaus) during childhood and
adulthood, suggests instead that a blend between the critical period model (with a
cutoff point at, say, puberty) and the linear decline model (lacking discontinuity at
any point) would better describe the present result. In fact, the recent study by
Munro and Mann (2005) suggests that the curve describing the relationship
between age of learning (or, in their case, age of immigration) and degree of
perceived foreign accent, although heavily linear on a restricted range, seems to
be globally sigmoid (p. 337) (for similar patterns, see Flege et al., 1999). Despite
its potentially far-reaching theoretical consequences, this aspect of our data will
not, however, be developed further in the present article. A serious reevaluation of
puberty as an upper limit of a critical or sensitive period as well as suggestions of a
sigmoid decline model clearly require more focused investigation in future studies.
15 English translation: par 'pair, couple', tal 'speech, number', kal 'bare, bald'.
16 English translation: bar 'bar, carried, naked', dal 'valley', gal 'crow(s)' (verb,
pres.).

17 When checked with Tukey's HSD and Bonferroni post hoc tests, the difference in
means between the AO 6–11 and AO 13–19 groups did not reach significance (p =
.062 and .074, respectively); however, the difference between AO 1–5 and AO
13–19 remained significant (both p < .04).
18 Due to technical problems, data are missing from the participant with AO 8 for
Instruments 4, 9, and 10 (see Table 8).
19 Similarly short and structurally simple sentences were used by White and Genesee
(1996) as well as by Marinova-Todd (2003) in their studies of very advanced late
L2 learners.
20 In fact, if one were to use the original English version of the Johnson and Newport
(1989) test as a measure of nativelikeness, most seventh graders in the Swedish
school system would turn out to be nativelike speakers of English, which, of
course, would be a quite bizarre claim.

References
Abrahamsson, N., & Hyltenstam, K. (2008). The robustness of aptitude effects in
near-native second language acquisition. Studies in Second Language Acquisition,
30(4), 481–509.
Abrahamsson, N., Stölten, K., & Hyltenstam, K. (in press). Effects of age on voice
onset time: The production of voiceless stops by near-native L2 speakers. In S.
Haberzettl (Ed.), Processes and outcomes: Explaining achievement in language
learning. Berlin: Mouton de Gruyter.
Asher, J., & García, G. (1969). The optimal age to learn a foreign language. Modern
Language Journal, 38, 334–341.
Bialystok, E. (1997). The structure of age: In search of barriers of second language
acquisition. Second Language Research, 13, 116–137.
Bialystok, E., & Hakuta, K. (1999). Confounded age: Linguistic and cognitive factors
in age differences for second language acquisition. In D. Birdsong (Ed.), Second
language acquisition and the critical period hypothesis (pp. 161–181). Mahwah,
NJ: Lawrence Erlbaum.
Bialystok, E., & Miller, B. (1999). The problem of age in second-language acquisition:
Influences from language, structure, and task. Bilingualism: Language and
Cognition, 2, 127–145.
Birdsong, D. (1992). Ultimate attainment in second language acquisition. Language,
68, 706–755.
Birdsong, D. (1999). Introduction: Whys and why nots of the critical period hypothesis
for second language acquisition. In D. Birdsong (Ed.), Second language acquisition
and the critical period hypothesis (pp. 1–22). Mahwah, NJ: Lawrence Erlbaum.
Birdsong, D. (2004). Second language acquisition and ultimate attainment. In A.
Davies & C. Elder (Eds.), Handbook of applied linguistics. London: Blackwell.
Birdsong, D. (2005a). Interpreting age effects in second language acquisition. In J.
Kroll & A. De Groot (Eds.), Handbook of bilingualism: Psycholinguistic
perspectives (pp. 109–127). Cambridge: Cambridge University Press.
Birdsong, D. (2005b). Nativelikeness and non-nativelikeness in L2A research. IRAL,
43, 319–328.
Birdsong, D. (2006). Age and second language acquisition and processing: A selective
overview. Language Learning, 56, 9–49.
Birdsong, D. (2007). Nativelike pronunciation among late learners of French as a
second language. In O.-S. Bohn & M. Munro (Eds.), Second language speech
learning: The role of language experience in speech perception and production (pp.
99–116). Amsterdam: Benjamins.
Birdsong, D., & Molis, M. (2001). On the evidence for maturational constraints in
second-language acquisition. Journal of Memory and Language, 44, 235–249.
Bley-Vroman, R. (1989). What is the logical problem of foreign language learning? In
S. Gass & J. Schachter (Eds.), Linguistic perspectives on second language
acquisition (pp. 41–68). Cambridge: Cambridge University Press.
Bongaerts, T., Mennen, S., & van der Slik, F. (2000). Authenticity of pronunciation in
naturalistic second language acquisition: The case of very advanced late learners of
Dutch as a second language. Studia Linguistica, 54(2), 298–308.
Bongaerts, T., Planken, B., & Schils, E. (1995). Can late learners attain a native accent
in a foreign language? A test of the critical period hypothesis. In D. Singleton &
Z. Lengyel (Eds.), The age factor in second language acquisition (pp. 3050).
Clevedon: Multilingual Matters.
Bongaerts, T., Planken, B., & Schils, E. (1997). Age and ultimate attainment in the
pronunciation of a foreign language. Second Language Research, 19, 447465.
Bongaerts, T. (1999). Ultimate attainment in L2 pronunciation: The case of very
advanced late L2 learners. In D. Birdsong (Ed.), Second language acquisition and
the critical period hypothesis (pp. 133159). Mahwah, NJ: Lawrence Erlbaum.
Bradlow, A. R., & Bent, T. (2002). The clear speech effect for non-native listeners.
Journal of the Acoustical Society of America, 112, 272284.
Butler, Y. G. (2000). The age effect in second language acquisition: Is it too late to
acquire native-level competence in a second language after the age of seven? In Y.
Oshima-Takane, Y. Shirai, & H. Sirai (Eds.), Studies in language sciences 1 (pp.
159169). Tokyo: The Japanese Society for Language Sciences.
Colantoni, L., & Steele, J. (2006). Native-like attainment in the L2 acquisition of
Spanish stop-liquid clusters. In C. A. Klee & T. L. Face (Eds.), Selected proceedings
of the 7th conference on the acquisition of Spanish and Portuguese as first and
second languages (pp. 5973). Somerville, MA: Cascadilla Proceedings Project.
Cook, V. (1999). Going beyond the native speaker in language teaching. TESOL
Quarterly, 33, 185209.
Coppieters, R. (1987). Competence differences between natives and near-native
speakers. Language, 63, 544573.
Language Learning 59:2, June 2009, pp. 249306

298

Abrahamsson and Hyltenstam

AO and Nativelikeness in an L2

Cranshaw, A. (1997). A study of Anglophone native and near-native linguistic and


metalinguistic performance. Unpublished doctoral dissertation, University of
Montreal, Canada.
Cunningham-Andersson, U., & Engstrand, O. (1989). Perceived strength and identity
of foreign accent in Swedish. Phonetica, 46, 138154.
Davies, A. (2003). The native speaker: Myth and reality. Clevedon, UK: Multilingual
Matters.
DeKeyser, R. M. (2000). The robustness of critical period effects in second language
acquisition. Studies in Second Language Acquisition, 22, 499533.
Ekberg, L. (2004). Grammatik och lexikon I svenska som andrasprak pa nastan infodd
niva. In K. Hyltenstam & I. Lindberg (Ed.), Svenska som andrasprak i forskning,
undervisning och samhalle (pp. 221258). Lund: Studentlitteratur.
Epstein, S., Flynn, S., & Martohardjano, G. (1996). Second language acquisition:
Theoretical and experimental issues in contemporary research. Behavioral and
Brain Sciences, 19, 677758.
Eubank, L., & Gregg, K. R. (1999). Critical periods and (second) language acquisition:
Divide et impera. In D. Birdsong (Ed.), Second language acquisition and the
critical period hypothesis (pp. 6599). Mahwah, NJ: Lawrence Erlbaum.
Flege, J. E. (1984). The detection of French accent by American listeners. Journal of
the Acoustical Society of America, 76, 692707.
Flege, J. E. (1999). Age of learning and second language speech. In D. Birdsong (Ed.),
Second language acquisition and the critical period hypothesis (pp. 101132).
Mahwah, NJ: Lawrence Erlbaum.
Flege, J. E., Frieda, E. M., & Nozawa, T. (1997). Amount of native-language (L1) use
affects the pronunciation of an L2. Journal of Phonetics, 25, 169186.
Flege, J. E., Munro, M. J., & MacKay, I. R. A. (1995). Factors affecting degree of
perceived foreign accent in a second language. Journal of the Acoustical Society of
America, 97, 31253134.
Flege, J. E., Yeni-Komshian, G. H., & Liu, S. (1999). Age constraints on
second-language acquisition. Journal of Memory and Language, 41, 78104.
Gregg, K. R. (1996). The logical and developmental problems of second language
acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language
acquisition (pp. 4981). San Diego: Academic Press.
Harley, B., & Wang, W. (1997). The critical period hypothesis: Where are we now? In
de A. Groot & J. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic
perspectives (pp. 1951). Mahwah, NJ: Lawrence Erlbaum.
Hyltenstam, K. (1992). Non-native features of near-native speakers. On the ultimate
attainment of childhood L2 learners. In R. J. Harris (Ed.), Cognitive processing in
bilinguals (pp. 351368). Amsterdam: Elsevier Science.
Hyltenstam, K., & Abrahamsson, N. (2000). Who can become native-like in a second
language? All, some, or none? On the maturational constraints controversy in
second language acquisition. Studia Linguistica, 54(2), 150166.
299

Language Learning 59:2, June 2009, pp. 249306

Abrahamsson and Hyltenstam

AO and Nativelikeness in an L2

Hyltenstam, K., & Abrahamsson, N. (2001). Age and L2 learning: The hazards of
matching practical implications with theoretical facts. (Comments on Stefka H.
Marinova-Todd, D. Bradford Marshall, and Catherine E. Snows Three
misconceptions about age and L2 learning). TESOL Quarterly, 35(1), 151
170.
Hyltenstam, K., & Abrahamsson, N. (2003a). Age of onset and ultimate attainment in
near-native speakers of Swedish. In K. Fraurud & K. Hyltenstam (Eds.),
Multilingualism in global and local perspectives. Selected papers from the 8th
Nordic conference on bilingualism, November 13, 2001, Stockholm Rinkeby
(pp. 319340). Stockholm: Centre for Research on Bilingualism, Stockholm
University, and Rinkeby Institute of Multilingual Research.
Hyltenstam, K., & Abrahamsson, N. (2003b). Maturational constraints in SLA. In C. J.
Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp.
539588). Oxford: Blackwell.
Ioup, G. (1989). Immigrant children who have failed to acquire native English. In S.
Gass, C. Madden, D. Preston, & L. Selinker (Eds.), Variation in second language
acquisition: Vol. 2. Psycholinguistic issues (pp. 160175). Clevedon, UK:
Multilingual Matters.
Ioup, G., Boustagui, E., El Tigi, M., & Moselle, M. (1994). Reexamining the critical
period hypothesis: A case study in a naturalistic environment. Studies in Second
Language Acquisition, 16, 7398.
Johnson, J. S., & Newport, E. L. (1989). Critical period effects in second language
learning: The influence of maturational state on the acquisition of English as a
second language. Cognitive Psychology, 21, 6099.
Lee, B., Guion, S. G., & Harada, T. (2006). Acoustic analysis of the production of
unstressed English vowels by early and late Korean and Japanese bilinguals. Studies
in Second Language Acquisition, 28, 487513.
Lenneberg, E. (1967). Biological foundations of language. New York: Wiley.
Lisker, L., & Abramson, A. (1964). A cross-language study of voicing in initial stops:
Acoustical measurements. Word, 20, 384422.
Liu, Y.-T. (2006). Specifying the norms of successful L2 users for developing theories
on the learning potential in SLA. Teachers College, Columbia University Working
Papers in TESOL & Applied Linguistics, 6(1). Retrieved March 26, 2006, from
http://journals.tc-library.org/index.php/tesol/article/view/164/162
Long, M. H. (1990). Maturational constraints on language development. Studies in
Second Language Acquisition, 12, 251285.
Long, M. H. (1993). Second language acquisition as a function of age: Research
Viberg (Eds.),
findings and methodological issues. In K. Hyltenstam & A.
Progression and regression in language (pp. 196221). Cambridge: Cambridge
University Press.
Long, M. H. (2005). Problems with supposed counter-evidence to the Critical Period
Hypothesis. International Review of Applied Linguistics, 43, 287317.
Language Learning 59:2, June 2009, pp. 249306

300

Abrahamsson and Hyltenstam

AO and Nativelikeness in an L2

Long, M. H., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In
C. Doughty & J. Williams (Eds.), Focus on form in classroom second language
acquisition (pp. 1641). Cambridge: Cambridge University Press.
MacKay, I. R. A., Flege, J. E., & Imai, S. (2006). Evaluating the effects of
chronological age and sentence duration on degree of perceived foreign accent.
Applied Psycholinguistics, 27, 157183.
Marinis, T. (2003). Psycholinguistic techniques in second language acquisition
research. Second Language Research, 19, 144161.
Marinova-Todd, S. H. (2003). Comprehensive analysis of ultimate attainment in adult
second language acquisition. Unpublished doctoral dissertation, Harvard
University, Massachusetts.
Markham, D. (1997). Phonetic imitation, accent, and the learner. Unpublished
doctoral dissertation. Lund: Lund University Press.
McAllister, R. (1997). Perceptual foreign accent: L2 users comprehension ability. In
A. James & J. Leather (Eds.), Second-language speech: Structure and process
(pp. 119132). Berlin: Mouton deGruyter.
McAllister, R., & Brodda, B. (2002). Development of a new speech comprehension
test with a phonological distance metric. Proceedings of Fonetik 2002, the XVth
Swedish Phonetics Conference, Stockholm, May 2931, 2002. Quarterly Progress
and Status Report (Dept of Speech, Music and Hearing and Centre for Speech
Technology, KTH, Stockholm) 44(1), 149151.
McDonald, J. L. (2000). Grammaticality judgments in a second language: Influences
of age of acquisition and native language. Applied Psycholinguistics, 21, 395423.
McDonald, J. L. (2006). Beyond the critical period: Processing-based explanation for
poor grammaticality judgment performance by late second language learners.
Journal of Memory and Language, 55, 381401.
McNamara, T. (2000). Language testing. Oxford: Oxford University Press.
Montrul, S., & Slabakova, R. (2003). Competence similarities between native and
near-native speakers. An investigation of the preterite-imperfect contrast in Spanish.
Studies in Second Language Acquisition, 25, 351398.
Moyer, A. (1999). Ultimate attainment in L2 phonology: The critical factors of age,
motivation and instruction. Studies in Second Language Acquisition, 21, 81108.
Munro, M., & Mann, V. (2005). Age of immersion as a predictor of foreign accent.
Applied Psycholinguistics, 26, 311341.
Neufeld, G. (1979). Towards a theory of language learning ability. Language
Learning, 29, 227241.
Neufeld, G. (2001). Non-foreign-accented speech in adult second language learners:
Does it exist and what does it signify? ITL Review of Applied Linguistics, 133134,
185206.
Obler, L. K. (1989). Exceptional second language learners. In S. Gass, C. Madden, D.
Preston, & L. Selinker (Eds.), Variation in second language acquisition
(pp. 141149). Clevedon, UK: Multilingual Matters.
301

Language Learning 59:2, June 2009, pp. 249306

Abrahamsson and Hyltenstam

AO and Nativelikeness in an L2

Oyama, S. (1976). A sensitive period for the acquisition of a nonnative phonological


system. Journal of Psycholinguistic Research, 5, 261285.
Oyama, S. (1978). The sensitive period and comprehension of speech. Working Papers
on Bilingualism, 16, 117.
Patkowski, M. (1980). The sensitive period for the acquisition of syntax in a second
language. Language Learning, 30, 449472.
Piller, I. (2002). Passing for a native speaker: Identity and success in second language
learning. Journal of Sociolinguistics, 6, 179206.
Piske, T., MacKay, I. R. A., & Flege, J. E. (2001). Factors affecting degree of foreign
accent in an L2: A review. Journal of Phonetics, 29, 191215.
Platzack, C. (1973). Spraket och lasbarheten. Lund: GWK Gleerup.
Schachter, J. (1989). Testing a proposed universal. In S. Gass & J. Schachter (Eds.),
Linguistic Perspectives On Second Language Acquisition (pp. 7388). Cambridge:
Cambridge University Press.
Schneider, W., Eschman, A., & Zuccolotto, A. (2002a). E-Prime users guide.
Pittsburgh: Psychology Software Tools, Inc.
Schneider, W., Eschman, A., & Zuccolotto, A. (2002b). E-Prime reference guide.
Pittsburgh: Psychology Software Tools, Inc.
Scovel, T. (1988). A time to speak: A psycholinguistic inquiry into the critical period
for human speech. New York: Newbury House.
Seliger, H. W. (1978). Implications of a multiple critical periods hypothesis for second
language learning. In W. C. Ritchie (Ed.), Second language acquisition research:
Issues and implications (pp. 1119). New York: Academic Press.
Seliger, H., Krashen, S., & Ladefoged, P. (1975). Maturational constraints in the
acquisition of second languages. Language Sciences, 38, 2022.
Selinker, L. (1969). Language transfer. General Linguistics, 9, 6792.
Selinker, L. (1972). Interlanguage. IRAL, 10, 209231.
Sorace, A. (1993). Incomplete and divergent representations of unaccusativity in
nonnative grammars of Italian. Second Language Research, 9, 2248.
Sorace, A. (2003). Near-nativeness. In C. J. Doughty & M. H. Long (Eds.), The
handbook of second language acquisition. Oxford: Blackwell.
Sorace, A., & Robertson, D. (2001). Measuring development and ultimate attainment
in non-native grammars. In C. Elder, A. Brown, E. Grove, K. Hill, N. Iwashita, T.
Lumley, et al. (Eds.), Experimenting with uncertainty. Essays in honour of Alan
Davies (pp. 264274). Cambridge: Cambridge University Press.
Spolsky, B., Sigurd, B., Sato, M., Walker, E., & Arterburn, C. (1968). Preliminary
studies in the development of techniques for testing overall second language
proficiency. In J. A. Upshur & J. Fata (Eds.), Problems in foreign language testing.
Language Learning Special Issue (No. 3, pp. 7998). Ann Arbor, Mich.: Research
Club in Language Learning.
Stevens, G. (2006). The age-length-onset problem in research on second language
acquisition among immigrants. Language Learning, 56, 671692.
Language Learning 59:2, June 2009, pp. 249306

302

Abrahamsson and Hyltenstam

AO and Nativelikeness in an L2

Stolten, K. (2005). Effects of age of learning on VOT in voiceless stops produced by


near-native L2 speakers. In A. Eriksson & J. Lindh (Eds.), Proceedings FONETIK
2005. The XVIII Swedish phonetics conference, May 2527 2005 (pp. 9194).
Goteborg: Department of Linguistics, Goteborg University.
Stolten, K. (2006). Effects of age on VOT: Categorical perception of Swedish stops by
near-native L2 speakers. In G. Ambrazaitis & S. Schotz (Eds.), Proceedings from
FONETIK 2006, Lund, June 79 2006 (pp. 125128). Lund: Centre for Languages
and Literature, (General Linguistics and Phonetics).
Taylor, W. L. (1953). Cloze procedure: A new tool for measuring readability.
Journalism Quarterly, 30, 414438.
Tsukada, K., Birdsong, D., Bialystok, E., Mack, M., Sung, H., & Flege, J. (2005). A
developmental study of English vowel production and perception by native Korean
adults and children. Journal of Phonetics, 33, 263290.
van Boxtel, S. (2005). Can the late bird catch the worm? Ultimate attainment in L2
syntax. Unpublished doctoral dissertation, Radboud University Nijmegen,
Utrecht.
van Boxtel, S., Bongaerts, T., & Coppen, P.-A. (2005). Native-like attainment of
dummy subjects in Dutch and the role of the L1. IRAL, 43, 355380.
Van Wuijtswinkel, K. (1994). Critical period effects in the acquisition of grammatical
competence in a second language. Nijmegen, The Netherlands: University of
Nijmegen.
White, L., & Genesee, F. (1996). How native is near-native? The issue of ultimate
attainment in adult second language acquisition. Second Language Research, 12,
233265.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge
University Press.

Appendix A
English Translation of the First Newspaper Advertisement
in Metro (Stockholm Edition), September 16, 2002, p. 22:
Stockholm University
Centre for Research on Bilingualism
Is Spanish your mother tongue?
Subjects with Spanish as their first learned language wanted for a research
project on age and language acquisition
Research on human language and language acquisition has consistently shown
that people who begin their acquisition of a second language at an early age
eventually reach an ultimate attainment comparable to the proficiency levels of
native speakers, while people who begin their acquisition in adulthood typically
do not reach such levels of ultimate attainment. But the same research has also
shown that there are exceptions to this, that is, persons who have begun their
second language acquisition relatively late in life (in their teens or later) and
still have reached a proficiency level comparable to that of native speakers.
We are currently looking for people who have begun the acquisition of
Swedish as a second language at varying ages, from childhood to adulthood,
and who have reached such a level of proficiency that native Swedish speakers
usually do not notice their non-Swedish mother-tongue background in everyday
communication.
The persons we are looking for must:
- have Spanish as their first learned language (even if they no longer use their
Spanish)
- be entirely fluent in Swedish, without exhibiting a noticeable foreign accent
or any obvious grammatical deviation
- be adults today (19 years or older)
- have lived in Sweden for at least 10 years
- have at least a high-school education
- have learned the variety of Swedish spoken in the greater Stockholm area
We are looking for people who are using both Spanish and Swedish on a
regular basis as well as people who already from the beginning have almost
exclusively used Swedish and only rarely, or never, Spanish (i.e., people who,
early or late in life, have had reasons to leave their mother tongue behind). In
other words, we are interested in functionally bilingual people as well as people
who have shifted language.
If you think that the above description fits you, and if you are willing
to participate in a research project that potentially can offer interesting answers
to questions concerning the human language learning ability, please contact
us on the following telephone number: [number removed] (telephone hours
Mon, Wed, Fri, 10 a.m.–2 p.m.), or send an email to [assistant's name and email
removed] with your name and telephone number. A small remuneration will be
given to those who are eventually selected as participants for the study.
Welcome and call us!
Kenneth Hyltenstam
Professor
Niclas Abrahamsson
PhD

Appendix B
Eight examples out of 80 grammaticality judgment items, grouped by structure
type. (a) = grammatical sentences, (b) = ungrammatical sentences. Target
structures are underlined, and for the ungrammatical items, the correct structure
is given in [ ].
1. Subject-verb inversion (V2)
(a) Med tanke på att den förmögenhet familjen förfogade över var
ganska betydande förstår man deras negativa inställning till dagens skattesystem.
Given that the fortune the family controlled was rather significant,
one understands their negative stance on today's tax system.
(b) Med tanke på att den högkonjunktur landet gick mot var mycket
tydlig man förstår [förstår man] kapitalägarnas uppfattning gällande
ekonomiska skyddstullar.
Given that the economic upturn the country was approaching was very
obvious, one understands the capitalists' position regarding protectionist
tolls.
2. Reflexive possessive pronouns
(a) De återkommande stamgästerna insåg genast att deras restaurangbesök
inte skulle vara sig lika efter ägarbytet.
The returning regular customers realized immediately that their visits
to the restaurant would not be the same after the change of owners.
(b) De mest rutinerade kroppsbyggarna såg till att sina [deras] benmuskler
utvecklades i samma takt som övriga muskler.
The most experienced body builders made certain that their leg muscles
developed at the same rate as their other muscles.
3. Placement of sentence adverbs in relative clauses
(a) Flygplanet träffade en kraftledning som flygledningen inte fick in på sin
skärm vilket var nära att orsaka en katastrof.
The plane hit a power line that the air-traffic controllers
could not pick up on their monitor, which nearly resulted in a catastrophe.
(b) Fartyget rammade en eka som styrmannen observerade inte [inte observerade]
på sin radar vilket fick katastrofala följder.
The ship rammed a rowboat that the helmsman hadn't noticed on his
radar, which had catastrophic consequences.
4. Adjective agreement in predicative position (example: AGR-Num, plural)
(a) Värdena som legat under det normala i flera veckor och därför inte
rapporterats till myndigheterna var nu plötsligt starkt förhöjda.
The levels that had been below normal for several weeks and therefore
not reported to the government authorities were now greatly increased.
(b) Skjulen som varit skymda av den höga stenmuren och därför inte
existerat i folks medvetande blev nu helt blottlagd [blottlagda].
The sheds that had been hidden by the high stone wall, and therefore
non-existent in people's consciousness, were now suddenly entirely
exposed.