When does native language input affect phonetic perception? The precocious case of lexical tone
Highlights
► Lexical tone perception was tested in both 4- and 9-month-old infants.
► English-learners showed a decline in discrimination by 9 months.
► Chinese-learners discriminated tones at both ages, but only in some contexts.
► Mandarin- and Cantonese-learners also had language-specific patterns at both ages.
► Phonetic input affects tone perception earlier than vowel or consonant perception.
Introduction
It is widely acknowledged that infants begin perceiving phonetic contrasts in language-specific ways from 6 to 12 months of age. A commonly reported pattern is one of maintenance and decline, where young infants initially perceive many native and non-native phonetic contrasts, but maintain sensitivity only to native contrasts as the perception of non-native contrasts declines (Anderson et al., 2003, Best et al., 1995, Bosch and Sebastián-Gallés, 2003, Cheour et al., 1998, Pegg and Werker, 1997, Polka and Werker, 1994, Rivera-Gaxiola et al., 2005, Werker and Lalonde, 1988, Werker and Tees, 1984a, Werker and Tees, 1984b). Of course, many other developmental patterns have now been reported. For example, if a particular consonant contrast is difficult to assimilate into the native language, like Zulu clicks into English, then infants as well as adults continue perceiving it well (Best and McRoberts, 2003, Best et al., 1988). Some types of vowel contrasts also remain acoustically salient throughout development (Polka & Bohn, 1996), due to language-universal biases in perception (Polka & Bohn, 2011). Other consonant contrasts are difficult to perceive in infancy, requiring language-specific input to learn (Narayan, Werker, & Beddor, 2010). Still more work shows that native language input does not just maintain perceptual sensitivity for some native phonetic contrasts, but also enhances it (Kuhl et al., 2006, Polka et al., 2001, Sundara et al., 2006, Tsao et al., 2006).
This work has built a rich understanding of how phonetic development unfolds, but remains limited in that it draws almost exclusively from the empirical study of phonetic segments (i.e., vowels and consonants). Less is known about how infants learn to perceive native prosodic contrasts—phonetic distinctions related to pitch (i.e., fundamental frequency [f0]), duration, and/or amplitude—likely because the use of these cues is relatively limited in Indo-European languages. One perspective is that prosodic contrasts develop on a similar schedule as most vowel and consonant contrasts. Language-specific perceptual patterns for many prosodic contrasts seem to emerge between 6 and 12 months of age, including lexical stress (Höhle et al., 2009, Jusczyk et al., 1993, Skoruppa et al., 2011, Skoruppa et al., 2009) as well as pitch accent (Sato, Sogabe, & Mazuka, 2009; see also Nazzi, Floccia, & Bertoncini, 1998). Nevertheless, a direct comparison with phonetic segments is difficult, as vowels and consonants occur within individual syllables, while lexical stress and pitch accent are defined across multiple syllables. A comparison between segments and those prosodic cues defined along similar timescales (i.e., within syllables) would provide a more closely matched comparison.
One such prosodic cue is phoneme duration (i.e., contrasts between short versus long vowels or single versus geminate consonants). Duration is used in many languages to distinguish words (e.g., Japanese, Arabic, Dutch, Berber), and developmental studies of its perception have shown very different developmental trajectories compared to those of vowels and consonants. For example, infants learning Japanese are only able to discriminate duration contrasts by 9–10 months of age, and have much more difficulty doing so at earlier ages (Sato et al., 2012, Sato et al., 2010). Moreover, cues to vowel duration are not used in language-specific ways until at least 18 months of age (Dietrich et al., 2007, Minagawa-Kawai et al., 2007, Mugitani et al., 2009), which suggests instead a markedly protracted developmental trajectory for prosodic cues relative to vowels and consonants.
Another such prosodic cue is lexical tone. Tones are primarily defined by f0 variations within single syllables, and are found in most of the languages native to the Americas, Sub-Saharan Africa, as well as East and Southeast Asia (Yip, 2002). For example, Mandarin uses four tones, each of which identifies a different word when spoken on the same syllable. In Mandarin, “ma” can mean mother (i.e., a high level tone), hemp (i.e., a rising tone), horse (i.e., a low dipping tone), or to scold (i.e., a falling tone), and thus tone constitutes an important factor in lexical retrieval for adults (Cutler & Chen, 1997).
Here we ask how the development of lexical tone perception unfolds in infancy. First we review tone perception in adults and in infants, which raises two important issues: how do infants converge on their native (versus non-native) tone system, and how does the developmental trajectory of tone perception compare to that of other phonetic units? Then we describe two experiments investigating these questions. Finally, we conclude by discussing the implications of this work for phonetic development more broadly.
Secondary acoustic cues to tone may include vowel duration (e.g., Blicher et al., 1990, Gandour and Harshman, 1978), or voice quality such as creaky voice (e.g., Gottfried & Suiter, 1997), but it is widely agreed that the two primary acoustic cues to tone are the variations in f0 level (e.g., high, middle, low), and/or f0 contour (e.g., steady, rising, falling) that occur within single syllables (Gandour, 1981, Gandour and Harshman, 1978, Khouw and Ciocca, 2007, Vance, 1976). Speakers of tone languages must negotiate the linguistic function of tones at segmental timescales with both foot- and phrase-level prosodic cues that have both grammatical (e.g., Price et al., 1991, Snedeker and Trueswell, 2002), and pragmatic functions in adults (e.g., Bock and Mazella, 1983, Dahan et al., 2002, Gussenhoven, 2004, Welby, 2003). Indeed, phonetic studies show that acoustic instantiations of tone strongly interact with the wider acoustic context, particularly carry-over from the preceding syllable and place within an utterance (see Xu (1999) for review). However, adult listeners are still adept at extracting identifiable f0 patterns in the face of this variability, particularly from the latter portion of the vowel, where canonical tone patterns are particularly robust (Khouw and Ciocca, 2007, Xu, 1997).
Learning to perceive non-native tone contrasts can be difficult for adult speakers of non-tone languages (Wang, Spence, Jongman, & Sereno, 1999), just as adults have trouble learning non-native vowel and consonant contrasts (Flege et al., 1997, Iverson and Evans, 2007, Lively, 1993, Pallier et al., 1997). Initial reports further suggested that speakers of a tone language are better at learning, identifying, and remembering all tone contrasts, even if the tones come from an unfamiliar language, than speakers of any non-tone language (Lee et al., 1996, Wayland and Guion, 2004). Later studies argued that a strict dichotomy between speakers of tone and non-tone languages is too simplistic, since non-native tone perception interacts in nuanced ways with the use of f0 in any language. For example, certain tone contrasts that are acoustically similar to phrase-level f0 distinctions can be quite easy for speakers of even non-tonal languages to distinguish (Francis et al., 2008, Hallé et al., 2004, So and Best, 2010, Wang et al., 2004).
There nevertheless remain important differences between the perception of f0 cues that signify post-lexical information such as grammatical distinctions, and those that signify lexical information such as word identity (Braun & Johnson, 2011). Speakers of lexical tone languages show greater degrees of categorical perception for f0 cues compared to non-tone language speakers, and this happens in both speech and non-speech contexts (Francis et al., 2006, Hallé et al., 2004, Peng et al., 2010, Xu et al., 2006). Moreover, several ERP studies have suggested that the brain signatures of tone processing are speeded and are asymmetric (towards the left-hemisphere) for tone language speakers compared to non-tone language speakers, supporting the idea that exposure to a lexical system that uses tone will change the way that the perceptual system encodes f0 cues (Chandrasekaran et al., 2007, Chandrasekaran et al., 2007, Kaan et al., 2008, Kaan et al., 2007, Luo et al., 2006, Xi et al., 2010). This suggests that, even though speakers of all languages must pay some attention to f0, important differences at the level of f0 perception can be found between speakers of tone and non-tone languages.
For infants it seems especially challenging to identify linguistically important f0 variation when these same cues also mark affective and communicative functions in infant-directed speech (Fernald, 1989, Papoušek et al., 1990, Spence and Moore, 2003, Stern et al., 1982). Only a few papers have begun to ask how infants begin to successfully unscramble and identify f0 cues in word segmentation and word learning tasks (Bortfeld and Morgan, 2010, Quam and Swingley, 2010, Quam and Swingley, 2012, Singh and Foong, 2012, Singh et al., 2002, Singh et al., 2004, Singh et al., 2008), and even fewer reports (reviewed below) have examined the phonetic perception of tone in infancy.
Tsao (2008) showed that the perception of Mandarin tone contrasts in 10- to 12-month-olds was easiest for more acoustically distinct tones, and harder for those that were acoustically more similar, echoing previous studies with adults (e.g., So & Best, 2010). Only three other published studies have further compared across different ages or across language groups (Harrison, 2000, Mattock and Burnham, 2006, Mattock et al., 2008). These are discussed below in light of two central questions about the development of tone perception. First, how do infants begin learning native versus non-native tone systems? Second, is the trajectory of perceptual development similar for tones, vowels, and consonants?
A seminal study by Mattock and Burnham (2006) showed the typical pattern of maintenance and decline in speech perception: English-learning infants discriminated a Thai tone contrast at 6 but not at 9 months of age, while Chinese-learning infants (i.e., a mixed group of Cantonese and Mandarin learners) discriminated the tone contrast at both ages. These authors concluded that Chinese learners maintain perceptual sensitivity for tones from 6 to 9 months of age, even if those tones do not come from the native system. In contrast, English learners show a decline in attention to tones within this same period of development.
Recall, however, that researchers studying adult listeners have recently argued that a simple dichotomy between speakers of tone versus non-tone languages is too simplistic. So and Best (2010) suggested that Cantonese speakers’ perception of Mandarin tones is at least partly influenced by patterns of assimilation into native tone categories (see also Best, 1995). Francis et al. (2008) suggested that tone perception is determined by language-specific acoustic weightings, invoking Gandour and colleagues’ (Gandour, 1983, Gandour and Harshman, 1978) identification of distinct perceptual dimensions for tone. Languages like Yorùbá make tonal distinctions based on the perceived height of the tone (i.e., level), while speakers of Mandarin mainly use the direction(s) of pitch change (i.e., direction). Speakers of other languages, like Thai or Cantonese, use a combination of both cues to distinguish tones (see also Vance, 1976). Francis et al. (2008) used multidimensional scaling techniques to show that English speakers weighted f0 level more heavily than f0 direction when making identifications of Cantonese tones, but that Mandarin speakers used the inverse weighting. Interestingly, both groups added weight to the other, unattended dimension after several training sessions.
Consider again the results of Mattock and Burnham (2006), who failed to find any differences between sub-groups of Cantonese- and Mandarin-learners in their infant sample. Two reasons may underlie this null result: first, their study was not designed to examine differences between Chinese sub-groups, and thus there may have been a lack of power when dividing the sample between Cantonese- and Mandarin-learners. A second possibility is that both groups of Chinese-learning infants may still discriminate many non-native tones by 9 months of age, but show preferences influenced by the native language, which could have been hard to detect using their methodology (i.e., a conditioned head-turn procedure). The first goal of the present study was to address these issues, using a more sensitive testing procedure to ask how the development of Cantonese tone perception unfolds among infants hearing tones de novo (English-learners), hearing non-native tones (Mandarin-learners), and hearing native tones (Cantonese-learners).
Previous work on infant tone perception has also raised the question of when language-specific influences emerge with respect to vowels and consonants. For example, language-specific perceptual patterns for vowels are reported as early as 6 months of age (Cheour et al., 1998, Kuhl et al., 1992, Polka and Werker, 1994), and continue to develop until at least 8–12 months of age (Bosch and Sebastián-Gallés, 2003, Polka and Bohn, 1996), while language-specific consonant perception is first seen by 8.5 months of age (Anderson et al., 2003), and continues emerging from 10–12 months of age (see Saffran, Werker, and Werner (2006) for review). This raises the question of whether tone perception is better aligned with vowels or with consonants in their developmental trajectory.
This was the topic of a recent report testing French- and English-learning infants at 4, 6, and 9 months of age on the same Thai tones used by Mattock and Burnham (2006) (Mattock et al., 2008). Crucially, both 4- and 6-month-old French- and English-learning infants were equivalently successful in discriminating tones, but 9-month-olds from either language group were not. This suggests that language-specific perception of tone does not emerge until at least 6 months of age, which is more similar to the development of consonant perception than to vowel perception. This is somewhat perplexing, given that tones are instantiated on the vocalic parts of syllables. Yet, as Mattock et al. (2008) point out, this finding is not surprising given other considerations. First, tones are classified differently from vowels in linguistic analysis (Yip, 1995, Yip, 2002), independently motivating developmental differences between vowels and tones. Second, the perception of other prosodic units (i.e., lexical stress) also seems to change in this age range: 6-month-olds show no behavioural preferences for language-dominant stress patterns, but 9- or 10-month-olds do (Höhle et al., 2009, Jusczyk et al., 1993, Turk et al., 1995; although see Friederici, Friedrich, & Christophe, 2007 for reports of language-specific stress perception by 4 months of age when measuring electrophysiological responses).
There are nevertheless some reasons to suspect that language-specific tone perception first emerges earlier than previously thought, as no study has yet compared learners of non-tone and tone languages when vowel perception is still considered language-independent (i.e., at 4 months of age). Harrison (2000), for example, reported a small study (n = 12) that suggests there may indeed already be cross-linguistic differences. Here, 6- to 8-month-old Yorùbá- and English-learning infants’ perception of f0 level was assessed, as this is the primary acoustic cue distinguishing tones from Yorùbá. By this age, Yorùbá-learning infants were already performing slightly better at f0 discrimination compared to English-learning infants. Moreover, previous work on tone perception has measured discrimination, while studies on the perception of other prosodic units, like lexical stress, often report infants’ preferences in addition to (or instead of) discrimination (Echols et al., 1997, Höhle et al., 2009, Jusczyk et al., 1993, Pons and Bosch, 2010, Turk et al., 1995). Although a preference for one language pattern over another is sufficient to imply discrimination, discrimination by itself does not necessarily imply a preference. Correspondingly, several reports show language-specific differences emerging in infancy that manifest as preferences (Höhle et al., 2009) or asymmetric discrimination patterns (Friederici et al., 2007, Kuhl et al., 1992), even as infants of all language backgrounds remain at least partially capable of discriminating these contrasts.
Here we asked whether language-specific patterns of tone perception emerge earlier than previously reported, using a testing procedure assessing both preference and discrimination in 4- and 9-month-olds learning either non-tone or tone languages. Mattock et al.’s (2008) previous procedure was adapted, where infants were given ‘alternating’ trials containing two unique tone types, as well as ‘non-alternating’ trials containing only one tone type. Differences in looking between these trial types imply discrimination of the stimuli (Best & Jones, 1998), and Mattock et al.’s (2008) implementation of this procedure suggested that infants would prefer looking at the alternating trial type when discriminating the tone contrast. Here we modified the procedure by giving each infant alternating trials containing the same syllables with either Tone X or Tone Y, non-alternating trials containing one tone (Tone X), and non-alternating trials containing the other tone (Tone Y). With this procedure we could simultaneously measure both discrimination (by observing any differences in looking across the three trial types), and preference (by measuring the pattern of preferences for alternating, non-alternating Tone X, and non-alternating Tone Y trials).
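The three trial types described above can be made concrete with a short sketch. This is only an illustration under stated assumptions, not the authors' implementation: the strict X-Y-X-Y ordering inside an alternating trial, the default of eight tokens per trial, and all function names are hypothetical choices that the text does not specify.

```python
from statistics import mean

# Hypothetical labels for the three trial types described in the text.
TRIAL_TYPES = ("alternating", "non_alternating_X", "non_alternating_Y")


def build_trial(trial_type, n_tokens=8):
    """Return the sequence of tone labels for one trial.

    Assumptions (not from the source): n_tokens=8 and strict X/Y
    alternation within an alternating trial.
    """
    if trial_type == "alternating":
        return ["X" if i % 2 == 0 else "Y" for i in range(n_tokens)]
    if trial_type == "non_alternating_X":
        return ["X"] * n_tokens
    if trial_type == "non_alternating_Y":
        return ["Y"] * n_tokens
    raise ValueError(f"unknown trial type: {trial_type}")


def mean_looking(looks):
    """Average looking time per trial type from (trial_type, seconds) pairs."""
    by_type = {}
    for trial_type, seconds in looks:
        by_type.setdefault(trial_type, []).append(seconds)
    return {trial_type: mean(times) for trial_type, times in by_type.items()}


def preferred_type(means):
    """Trial type with the longest mean looking time (a crude preference index)."""
    return max(means, key=means.get)
```

In this sketch, unequal means across the three trial types would correspond to discrimination, and the ordering of those means to preference. Any real analysis would rest on inferential statistics over infants rather than a simple comparison of means.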
Here we tested infants learning English, Mandarin, and Cantonese. The latter two are officially considered dialects of Chinese, but it remains important to note that they are often considered different languages altogether (like Spanish and Portuguese) due to substantial differences at morphosyntactic, lexical, as well as phonological levels (including their tonal systems) that render them mutually unintelligible.
Tones are described here by an impressionistic notational system (Chao, 1968), where ‘1’ designates the lowest level of a speaker’s f0 range, and ‘5’ the highest. In this system, the first number denotes a starting f0 level and subsequent numbers denote inflection points in the f0 contour, or the ending f0 level. For example, Mandarin tones would be notated as a ‘high level’ tone 55, a ‘rising’ tone 25, a ‘falling’ tone 51, and a ‘low dipping’ tone 214. Cantonese, on the other hand, has six contrastive tones (Bauer & Benedict, 1997). Three of these are level tones, where f0 dips only slightly from beginning to end: a ‘high level’ tone 55, a ‘mid level’ tone 33, and a ‘low level’ tone 22. The other three Cantonese tones are contours: a ‘high rising’ tone 25, a ‘low rising’ tone 23, and a ‘low falling’ tone 21. Although citation forms of Cantonese and Mandarin tones differ dramatically in precise f0 range, length, and endpoints, there is also quite a bit of overlap between these two systems. Fig. 1 illustrates the four Mandarin tones and six Cantonese tones.
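The notational system can be illustrated with a small sketch that stores the tone numerals exactly as listed above and classifies each by the direction of its f0 movement. The function names and the "peaking" label are assumptions introduced here for illustration. The notation is impressionistic, so it abstracts away from details such as the slight dip in the Cantonese level tones.

```python
# Tone numerals as given in the text (1 = lowest f0 level, 5 = highest).
MANDARIN = {"high level": "55", "rising": "25", "falling": "51", "low dipping": "214"}
CANTONESE = {
    "high level": "55", "mid level": "33", "low level": "22",
    "high rising": "25", "low rising": "23", "low falling": "21",
}


def contour_shape(numerals):
    """Classify a tone numeral string by the direction of its f0 steps."""
    levels = [int(digit) for digit in numerals]
    steps = [later - earlier for earlier, later in zip(levels, levels[1:])]
    if all(step == 0 for step in steps):
        return "level"
    if all(step >= 0 for step in steps):
        return "rising"
    if all(step <= 0 for step in steps):
        return "falling"
    # Mixed directions: fall-then-rise is "dipping"; "peaking" is a
    # hypothetical label for rise-then-fall, unused by the tones above.
    return "dipping" if steps[0] < 0 else "peaking"


def shared_numerals(system_a, system_b):
    """Numeral strings that occur in both tone systems."""
    return sorted(set(system_a.values()) & set(system_b.values()))
```

With the numerals as listed, `shared_numerals(MANDARIN, CANTONESE)` returns `['25', '55']`, which reflects the notational overlap between the two systems. It says nothing about the differences in precise f0 range, length, and endpoints noted above.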
In Experiment 1 we tested 4- and 9-month-old English-learning infants’ perception of a Cantonese tone contrast. In Experiment 2, we ran an identical procedure on two groups of Chinese-learning infants: Mandarin-learners, for whom the tone contrast is non-native, and Cantonese-learners, for whom the contrast is native. Among the set of possible Cantonese contrasts, we chose the perception of the high rising tone (‘Tone 25’) versus the mid level tone (‘Tone 33’) for several reasons. First, this contrast would be the most similar to the Thai contrast used by Mattock and colleagues. Second, we could compare perceptual preferences in the Mandarin groups between one tone type easily assimilated to the native language (i.e., Tone 25 is very similar to the Mandarin rising tone) and one tone type that is not (i.e., Tone 33 is dissimilar to any Mandarin tone). Third, we could examine, within each group, infants’ relative preference for contour (i.e., Tone 25) versus level (i.e., Tone 33) tones.
Section snippets
Stimuli
Cantonese tones were instantiated on a CV syllable, pronounced “chee” (/ʨhi/), written as qi following commonly used Cantonese and Mandarin romanizations.
Experiment 2
The procedure and design from Experiment 1 were repeated with separate groups of 4- and 9-month-olds learning either Mandarin or Cantonese. In our non-native Mandarin group, infants heard a tone contrast that included one tone that could be assimilated to the native inventory (i.e., Tone 25), but another tone that was a relatively poorer match (i.e., Tone 33, which is lower than the only other level tone [55] in Mandarin).
General discussion
Research on infant speech perception has long focused on the development of consonants and vowels with a relative dearth of studies on other lexically relevant prosodic cues (i.e., tone, stress, pitch accent, and phoneme duration). The current study investigated tone perception in infancy, identifying two main questions to which we return below.
Conclusions
Speech perception develops remarkably quickly in infancy, as infants become attuned to the properties of the native language within a very short amount of time. The present study contributes to our understanding of this process in two main ways. First, this study shows, in more detail, how the perception of lexical tone develops in infancy. We replicated previous reports showing language-specific differences in infant tone perception (Mattock and Burnham, 2006, Mattock et al., 2008), and further
Acknowledgments
This research was supported by a grant from the Natural Sciences and Engineering Research Council of Canada (81103) to J.F.W., a grant from the McDonnell Foundation (412783-001G) to Richard N. Aslin and J.F.W., and a fellowship from the Fondation Fyssen to H.H.Y. We thank Valter Ciocca for his helpful knowledge and advice on tone contrasts and tone perception, as well as on stimuli construction. We also thank Jacqueline Chong for help preparing stimuli, and recruiting subjects and Lawrence
References (141)
- Stimulus-alternation preference procedure to test infant speech discrimination. Infant Behavior and Development (1998)
- Divergent developmental patterns for infants’ perception of two nonnative consonant contrasts. Infant Behavior and Development (1995)
- Effects of syllable duration on the perception of the Mandarin Tone 2/Tone 3 distinction: Evidence of auditory enhancement. Journal of Phonetics (1990)
- Is early word-form processing stress-full? How natural variability supports recognition. Cognitive Psychology (2010)
- Question or tone 2? How language experience and linguistic function guide pitch processing. Journal of Phonetics (2011)
- Mismatch negativity to pitch contours is influenced by language experience. Brain Research (2007)
- The development of infants’ preference for motherese. Infant Behavior and Development (1997)
- The periodicity bias. Journal of Phonetics (1993)
- Accent and reference resolution in spoken-language comprehension. Journal of Memory and Language (2002)
- The perception of rhythmic units in speech by infants and adults. Journal of Memory and Language (1997)
- The equivalence of cues in the perception of speech by infants. Infant Behavior and Development
- Four-month-old infants prefer to listen to motherese. Infant Behavior and Development
- Acoustic determinants of infant preference for motherese speech. Infant Behavior and Development
- Effects of experience on non-native speakers’ production and perception of English vowels. Journal of Phonetics
- Perceptual learning of Cantonese lexical tones by tone and non-tone language speakers. Journal of Phonetics
- Brain responses in 4-month-old infants are already language specific. Current Biology
- Tone perception in Far Eastern languages. Journal of Phonetics
- Learning phonetic categories by tracking movements. Cognition
- Effect of linguistic experience on the identification of Mandarin Chinese vowels and tones. Journal of Phonetics
- Identification and discrimination of Mandarin Chinese tones by Mandarin Chinese vs. French listeners. Journal of Phonetics
- Acquiring the phonology of lexical tone in infancy. Lingua
- Language specific prosodic preferences during the first half year of life: Evidence from German and French infants. Infant Behavior & Development
- Effects of native language and training on lexical tone perception: An event-related potential study. Brain Research
- Discrimination of polysyllabic sequences by one- to four-month-old infants. Journal of Experimental Child Psychology
- Perceptual correlates of Cantonese tones. Journal of Phonetics
- Speaker normalization in perception of lexical tone. Journal of Phonetics
- Phonological specificity of vowels and consonants in early lexical representations. Journal of Memory and Language
- The developmental course of lexical tone perception in the first year of life. Cognition
- Cerebral lateralization and early speech acquisition: A developmental scenario. Developmental Cognitive Neuroscience
- Use of phonetic specificity during the acquisition of new words: Differences between consonants and vowels. Cognition
- Discrimination of pitch contours by neonates. Infant Behavior & Development
- Perception and acquisition of linguistic rhythm by infants. Speech Communication
- A limit on behavioral plasticity in speech perception. Cognition
- Infant responses to prototypical melodic contours in parental speech. Infant Behavior and Development
- The meanings of melodies in motherese in tone and stress languages. Infant Behavior and Development
- The influence of language experience on categorical perception of pitch contours. Journal of Phonetics
- Asymmetries in vowel perception. Speech Communication
- Natural Referent Vowel (NRV) framework: An emerging view of early phonetic development. Journal of Phonetics
- Structural generalizations over consonants and vowels in 11-month-old infants. Cognition
- Phonological knowledge guides two-year-olds’ and adults’ interpretation of salient pitch contours in word learning. Journal of Memory and Language
- A statistical basis for speech sound discrimination. Language and Speech
- Modern Cantonese phonology
- An investigation of young infants’ perceptual representations of speech sounds. Journal of Experimental Psychology: General
- Infant perception of non-native consonant contrasts that adults assimilate in different ways. Language and Speech
- Examination of perceptual reorganization for nonnative speech contrasts: Zulu click discrimination by English-speaking adults and infants. Journal of Experimental Psychology: Human Perception and Performance
- A direct realist view of cross-language speech perception
- Intonational marking of a given and new information: Some consequences for comprehension. Memory & Cognition
- Linguistic constraints on statistical computations: The role of consonants and vowels in continuous speech processing. Psychological Science
- Simultaneous bilingualism and the perception of a language-specific vowel contrast in the first year of life. Language and Speech
- Neuroplasticity in the processing of pitch dimensions: A multidimensional scaling analysis of the mismatch negativity. Restorative Neurology and Neuroscience
1. Present address: School of Audiology and Speech Sciences, University of British Columbia, 2177 Wesbrook Mall, Vancouver, BC, Canada V6T 1Z3.