Native language background affects the perception of duration and pitch

Estonian is a quantity language with both a primary duration cue and a secondary pitch cue, whereas Chinese is a tonal language in which pitch is the dominant cue. Using a mismatch negativity experiment and a behavioral discrimination experiment, we investigated how native language background affects the perception of duration only, pitch only, and duration plus pitch information. Chinese participants perceived duration in Estonian as meaningless acoustic information due to a lack of phonological use of duration in their native language; however, they demonstrated a better pitch discrimination ability than Estonian participants. On the other hand, Estonian participants outperformed Chinese participants in perceiving the non-speech pure tones that resembled the Estonian quantity (i.e., containing both duration and pitch information). Our results indicate that native language background affects the perception of duration and pitch and that such an effect is not specific to processing speech sounds.


Introduction
The perception of acoustic information like duration and pitch is affected by a variety of factors, including one's life-long experience of the native language (L1). Around 60-70% of the world's languages are tonal (e.g., Mandarin Chinese, Thai), i.e., they use the alternation of pitch contours to discriminate word meanings (Yip, 2002). There is ample evidence that speakers of tonal languages are better at discriminating pitch in non-native languages and non-speech sounds due to extensive exposure to pitch information in their L1 (e.g., Bidelman et al., 2013; Chandrasekaran et al., 2007; K. Yu et al., 2019). Whereas Mandarin Chinese (Mandarin or Chinese hereafter) is a tonal language where pitch is dominantly used to discriminate word meanings (Lin, 2007), Estonian is a quantity language that features both a primary duration cue and a secondary pitch cue (Lippus, 2011; Lippus et al., 2009). In the present study, we will investigate how Chinese and Estonian native speakers perceive duration and pitch in foreign languages and how native language background affects the processing of non-speech sounds.
In addition to the temporal features, pitch has been shown to play an important role in the Estonian quantity distinction. Whereas in Q1 and Q2, there is a step-down in the F0 contour between the end of the first syllable nucleus and the beginning of the second syllable, Q3 is associated with a fall early during the first syllable (Liiv, 1961; Lippus, 2011; Lippus et al., 2009). The falling pitch in the first syllable is crucial for perceiving Q3 among Estonian native speakers. As shown by Lippus et al. (2009), when the S1/S2 ratio was greater than 2, Q3 was perceived only if the pitch was falling in the first syllable, as is typical for Q3; if the pitch was flat in the first syllable, as is typical for Q1 and Q2 words, the Estonian group identified Q3 or Q2 at chance levels.
Duration in Mandarin is not used phonologically, i.e., it does not distinguish word meanings. Whether duration influences the categorical perception of Chinese tone remains a controversial issue. Whereas some studies showed that native Chinese speakers benefit from the duration cue to categorically perceive the lexical tone (Blicher et al., 1990; S. Chen et al., 2017), others suggested no significant duration effect on the perception of lexical tones among young Chinese adults (Chang, 2011; Feng & Peng, 2018; D. Wang & Peng, 2012; Y. Wang et al., 2017).

Effect of native language background on the perception of non-native languages
Language experience has been found to affect the perception of non-native languages. Many studies adopted the mismatch negativity (MMN), an electrocortical response elicited by an unattended change in the sensory input (Näätänen, 1992; Näätänen et al., 1997, 2007), as an index of the brain's ability to automatically detect the difference between the frequent/standard and the rare/deviant stimuli (e.g., Jacobsen et al., 2004). A recent study reported larger MMNs among Estonian speakers than Russian speakers who had little or no experience in the Estonian language in response to the duration change in the Estonian stimuli, as the former have long-term familiarity with the stimuli as meaningful words in their L1 (Kask et al., 2021). Similarly, native speakers of Finnish, a language where vowel duration is crucial for phoneme distinction, showed larger MMNs than non-native Russian speakers (Ylinen et al., 2006) and second-language users of Finnish (Nenonen et al., 2003) in response to the vowel duration in Finnish speech sounds. At the behavioral level, Lehiste and Fox (1992) found that Estonian speakers identified a longer token as "prominent" more frequently than English speakers, suggesting that speakers of a quantity language are more sensitive to differences in duration. More recently, Lee et al. (2022) found that Japanese speakers were significantly better at discriminating Estonian duration contrasts than Cantonese speakers. Whereas Cantonese speakers use duration to a limited extent in their L1, having systematic two-way duration contrasts in their L1 allows Japanese speakers to outperform Cantonese speakers in discriminating the Estonian quantity. Comparing native speakers of Estonian, Swedish, Finnish, and Mandarin, Šimko et al. (2015) found that Mandarin speakers, whose L1 lacks vowel duration contrasts, made less precise judgments on the length of the sounds than the other three groups.
Regarding pitch perception, Chandrasekaran et al. (2007) found larger MMNs for the Chinese group than the English group when they discriminated the Chinese T1 and T3. The results suggest that the relative saliency of acoustic dimensions in a particular language shapes early cortical processing of pitch contours. In another MMN experiment that used cross-category and within-category Chinese tones as the deviants, Shen and Froud (2019) found that among native Mandarin speakers, the cross-category deviant elicited a larger MMN over the left hemisphere relative to the within-category deviant. However, the categorical perception of Chinese tone was not found among native speakers of a non-tonal language like English. A native-like categorical perception was also found in subjects with a tonal L1 but not in subjects with a non-tonal L1, suggesting that categorical perception of non-native lexical tone is modulated by participants' L1 experience (K. Yu et al., 2019). At the behavioral level, Tsukada and Kondo (2019) found that while native Mandarin speakers outperformed Australian English speakers and Burmese speakers in perceiving all Mandarin tonal contrasts, speakers of Burmese, who use lexical tones in their L1, did not necessarily outperform speakers of a non-tonal language like Australian English. The perception of non-native contrasts may be interfered with by the native prosodic system. For example, Hao (2012) found that Cantonese subjects had difficulty distinguishing Mandarin T1 and T4, which could be attributed to the fact that they tended to perceptually assimilate the Mandarin T1 and T4 to overlapping Cantonese tonal categories (see Best, 2019; J. Chen et al., 2020).
Researchers have proposed different theoretical accounts to explain the L1 effect on the perception of suprasegmental information (e.g., duration, pitch, loudness) in non-native languages. One is the Feature Hypothesis (McAllister et al., 2002). Based on the results of the perception of duration in Swedish quantity, McAllister et al. (2002) proposed that features not used to signal phonological contrast in L1 will be difficult for foreign language learners to perceive. The Perceptual Assimilation Model (PAM) proposes that when non-native tones clearly assimilate to separate native tone types ('two-category assimilation'), they are perceptually distinct and easy to perceive, but when they assimilate to a single tone category ('single-category assimilation'), they may be confusing and difficult to perceive (Best, 2019; J. Chen et al., 2020). Schaefer and Darcy (2014) proposed in their Functional Pitch Hypothesis that L1 pitch status shapes the perception of a non-native tone system. The model proposes a perceptual advantage for speakers from higher L1 pitch status, suggesting that the functionality of linguistic pitch to signal lexical contrast in the L1 shapes the perception of non-native pitch in a gradient fashion.
Estonian is a distinct language where both duration and pitch are linguistically relevant, and it involves using the two cues simultaneously. The previous theoretical models have mostly focused on a single feature (e.g., duration or pitch) in foreign language perception. While it is relatively clear that Estonian duration will bring extra difficulty for Chinese speakers because duration is not phonologically used in Chinese (Feature Hypothesis; McAllister et al., 2002) and that Chinese pitch will be difficult for Estonian speakers to perceive because pitch in Estonian is less prominent than that in Chinese (Functional Pitch Hypothesis; Schaefer & Darcy, 2014), it remains unclear how speakers of different L1s will perceive information that contains both duration and pitch simultaneously (e.g., an Estonian Q2-Q3 contrast). For example, when perceiving the Estonian Q2-Q3 contrast, would the pitch difference be enough for Chinese participants to discriminate the two stimuli (and therefore, Chinese participants perform similarly to Estonian participants), or would the duration difference cause confusion in their discrimination (and therefore, Chinese participants perform worse than Estonian participants)? The present study will include a condition where duration and pitch cues are simultaneously used. We hope to provide novel insights into the existing literature on how L1 affects the perception of non-native languages.

Influence of native language background on the perception of nonlinguistic stimuli
Ample evidence has shown the effect of language experience on the perception of auditory signals in non-speech sounds. Using pure tones as the stimuli, Tervaniemi et al. (2006) found larger MMN responses in Finnish native speakers, whose L1 features a duration contrast, relative to German speakers, whose L1 does not have a duration cue in it, in response to the linguistically relevant duration change, but not in response to the linguistically irrelevant frequency change. Similar results were found by Kirmse et al. (2008), where Finns showed shorter MMN latencies, thus higher sensitivity, than Germans when perceiving duration change but not frequency change in harmonically complex tones. Marie et al. (2012) manipulated the duration of harmonic sounds and found enhanced MMNs among Finnish speakers compared to French speakers whose L1 does not have duration contrasts. Giuliano et al. (2011) found earlier neural responses to the pitch change in pure tones among Chinese speakers than participants without exposure to a tonal language, suggesting that using tonal pitch contours in L1 leads to a general enhancement in the acuity of pitch representations. Behaviorally, Bidelman et al. (2013) showed that speakers of Cantonese, a tonal language, outperformed (non-musician) speakers of English, a non-tonal language, in basic auditory as well as complex music perception. The results suggest that a background in tonal language is associated with higher auditory perceptual performance when listening to music.
While the majority of previous studies used musical items, i.e., auditory signals in a completely different domain than language, to examine the cross-domain effect of language experience, it is unclear how language-based non-speech sounds are processed. For example, we can create certain non-speech sounds that "take away" segmental information (i.e., consonants and vowels, the language part of a sound) but keep the word's other acoustic features (e.g., syllable ratio, pitch contour). The stimulus created in this way is not language in nature because it does not contain consonants or vowels, but it resembles the physical features of the corresponding speech sounds. Whether such language-based non-speech sounds will be processed similarly to the corresponding speech sounds remains to be explored. Using language-based non-speech sounds will allow us to directly compare the speech and non-speech stimuli and examine how "far" language can affect the perception in the non-speech domain.

Hemisphere lateralization in the perception of native and non-native languages
Hemisphere lateralization has been examined in relation to native and non-native perceptions of language. For example, Pulvermüller et al. (2001) found larger MMNs elicited by Finnish words than pseudowords among Finnish native speakers but not among foreigners who did not know any Finnish. Whole-head magnetoencephalography on Finns showed that such word-related MMN was located in the left hemisphere. Using pseudowords as the stimuli, Kirmse et al. (2008) found a leftward shift of the MMN scalp distribution for changes in vowel duration among Finns but not Germans as a result of the extensive linguistic experience with phonetically distinctive vowel duration contrasts in Finnish but not in German. Comparing Chinese and English speakers' perceptions of Chinese tone and intonation, Gandour et al.'s (2004) fMRI study suggests speech prosody perception is mediated primarily by the right hemisphere but is left-lateralized when language processing is required beyond the acoustic analysis of the complex sound. In a study on native Chinese speakers, Xi et al. (2010) found that the acoustic information in Chinese tone (e.g., within-category deviant) is processed in the right hemisphere, but the processing of phonological information in Chinese tone (e.g., cross-category deviant) is left-lateralized. Behaviorally, Y. Wang et al.'s (2001) dichotic listening experiment on the perception of Chinese tone showed a significant right ear advantage among Mandarin speakers but no ear preference among American English native speakers with no tonal knowledge. The results suggest that Mandarin tones are processed in the left hemisphere by native Mandarin speakers but are bilaterally processed by English speakers.
These findings suggest that phonological (meaning-related) information or native language is usually processed in the left hemisphere, whereas acoustic or non-native speech sounds are usually processed in the right. Although electroencephalography (EEG) has classically been considered to have a low spatial resolution (Cao et al., 2021), it is common practice to group electrodes in each hemisphere to obtain information about hemispheric differences during language processing (e.g., Coulson & Kutas, 2001; Xi et al., 2010). In the present study, we will use lateralization as an index of phonetic/acoustic or phonological processing. If certain stimuli elicit a right-lateralized MMN, this would indicate that they are perceived as acoustic rather than phonological information.

The present study
The present study investigates how native language background affects the perception of duration and pitch in foreign speech sounds and how the effect can be transferred to the non-speech domain. We will focus on Estonian, a quantity language where both duration (primary) and pitch (secondary) cues exist in its phonological system, and Mandarin Chinese, a tonal language where pitch is a primary acoustic feature, and recruit native speakers of both languages. We will use MMN as a tool to investigate the brain's automatic processing of information, followed by a behavioral AX discrimination experiment that is expected to provide supplemental information about auditory processing after the automatic process.
Despite a vast amount of literature on the effect of native language background on the perception of foreign languages and non-speech sounds, the present study is novel in the following aspects. First, we will include a language where both duration and pitch are linguistically relevant, i.e., Estonian. The effect of native language background on foreign language perception has been predominantly demonstrated by comparing tonal and non-tonal language speakers' perception of lexical tone in East or Southeast Asian languages (e.g., Mandarin, Cantonese, Thai) (e.g., Chandrasekaran et al., 2007; Hao, 2012). While there have been a few studies on the perception of the Estonian quantity among speakers with different language backgrounds, studies that examine both the pre-attentive and attentive processing of the Estonian quantity are still limited. Second, in addition to examining the duration and pitch information individually, we will include a condition that simultaneously examines the duration and pitch features. The combination of duration and pitch is a distinct feature of the Estonian language. The perception of Estonian quantity depends on the existence of both duration and pitch cues (Lippus et al., 2009). Simultaneously investigating duration and pitch features provides novel insights into how lifelong experience with the native language affects the brain's perception of specific sound features. Third, we will use language-based pure tones to examine how native language background affects the perception of non-speech sounds. We will create pure tone stimuli that remove words' segmental information but keep the other acoustic features. Using pure tones created this way allows us to zoom into the non-speech domain and observe subtle effects of native language background, if there are any.
Cross-language differences in speech perception depend on the specific dimensions of sound features that native speakers are exposed to in natural speech contexts. We hypothesize that long-term familiarity with duration and/or pitch in L1 facilitates their processing. Specifically, regarding the perception of the Estonian stimuli, we expect Estonian native speakers to outperform Chinese native speakers in perceiving the duration-only change and the duration plus pitch change, given that duration plays a primary role in the Estonian language and that the combination of duration and pitch is a prevalent feature in the Estonian quantity system (Feature Hypothesis; McAllister et al., 2002). The pitch-only condition in the Estonian stimuli does not contain typical information for either the Estonian or Chinese group, and therefore, it is hard to make clear predictions about which group will be better at perceiving it. On the other hand, regarding the perception of the Chinese stimuli, we expect Chinese native speakers to outperform Estonian native speakers in perceiving the pitch-only change given that pitch is more prominent in the Chinese language than in the Estonian language (Functional Pitch Hypothesis; Schaefer & Darcy, 2014). It is hard to predict the duration-only and duration plus pitch conditions in the Chinese stimuli because the information in the two conditions is equally foreign to both groups (i.e., Chinese words carrying primary features of Estonian quantity). For both the Estonian and Chinese stimuli, if native language background has an effect on processing non-speech sounds, then we should find similar group differences in the processing of the pure tone stimuli. If certain acoustic information is perceived as phonetic rather than phonological, we should find the MMN to be right-lateralized (e.g., Gandour et al., 2004).

Participants
A total of 124 participants were recruited, including 61 Chinese native speakers in China (mean age 22.1 years; age range 18-25 years; 28 males, 33 females) and 63 Estonian native speakers in Estonia (mean age 21.2 years; age range 18-26 years; 28 males, 35 females). One-third of the participants were selected for our other study (Lyu et al., 2024) as a control group by matching their gender and age to the musician sample in that study. Since the present study investigates different research questions from Lyu et al. (2024), the data of those participants were kept here. All participants had self-reported normal hearing and normal or corrected-to-normal vision. The Chinese participants had no self-reported exposure to the Estonian language or any other languages that adopt duration to differentiate word meanings, and the Estonian participants had no self-reported exposure to the Chinese language or any other tonal languages. All participants were right-handed (N=123) or converted to right-handed (N=1, Chinese subject) and lived in their native country before age seven. No participants had self-reported previous musical education or neurological or psychiatric disorders.
The experiment was approved by the Institutional Review Board, College of Foreign Languages, Zhejiang University of Technology in China, and the Research Ethics Committee at the University of Tartu in Estonia, respectively, and was performed following relevant named guidelines and regulations. All participants gave their written informed consent before the beginning of the experiment and received a gift card as compensation after the experiment.

Stimuli
The Estonian stimuli were Q2 and Q3 Sada words, i.e., saada 'to send' (Q2) and saada 'to get' (Q3). The stimuli were created based on a previous study by Lippus et al. (2009). Following Lippus et al. (2009), we recorded the two base words in a carrier sentence Ütle Sada palun 'Say Sada please' so that the words were pronounced in a natural context with the correct quantity level (i.e., word meaning). The base words were recorded by a female native Estonian speaker at a sampling rate of 44.1 kHz. The duration of the vowel of the first syllable (V1) of the two base words was manipulated, starting from 50 ms in nine steps of 30 ms to 290 ms. The consonant of the first syllable (C1), the consonant of the second syllable (C2), and the vowel of the second syllable (V2) remained unmanipulated. According to Lippus et al.'s (2009) results, Estonian speakers perceived Q2-based stimuli as Q2 the most when V1 had a duration of 170 ms and Q3-based stimuli as Q3 the most when V1 had a duration of 290 ms. We therefore selected the 170 ms and 290 ms manipulations from the sets of stimuli we created for the present study. To make the stimuli comparable across conditions, we included the 170 ms and 290 ms versions of both Q2 and Q3, resulting in four Sada words in total. The two Q3-based words had a steeper falling pitch slope on V1 than the two Q2-based words (Table 1). As shown by Lippus et al. (2009), the Q2-170 word was typically perceived as Q2, and the Q3-170 and Q3-290 words were typically perceived as Q3 by Estonian native speakers. It is worth noting that Q2-290 in the Lippus et al. (2009) study was perceived as Q2 or Q3 at chance level by Estonian native speakers due to having a large S1/S2 ratio but lacking the pitch cue. Nevertheless, since all the Estonian stimuli were foreign (i.e., pseudowords) to Chinese participants, whether Q2-290 was a typical Q2 or Q3 did not affect the interpretation of the results in the present study.
The Chinese stimuli were T1 and T2 Jidi words, i.e., ji1di4 'base' (T1) and ji2di4 'polar area' (T2). In Mandarin Chinese, each syllable is a tone-bearing unit (Lin, 2007). In connected speech, the basic shapes of Mandarin tones often undergo context-induced modifications (X. S. Shen, 1990; Xu, 1994). To avoid coarticulation from neighboring words, we recorded the two Chinese base words in isolation (see footnote 1). The two base words were recorded by a female native Chinese speaker, who was informed of the meaning of the two words, at a sampling rate of 44.1 kHz. The base words were re-synthesized to match the four versions of the Estonian stimuli. The duration of each segment was manipulated to 90 ms (C1), 150/250 ms (V1), 40 ms (C2), and 240 ms (V2), resulting in four Jidi words in total (Table 1). To isolate the pitch contour and keep the rest of the acoustic features identical, we extracted the pitch tier of ji1di4 and then replaced the pitch of ji2di4 with the extracted pitch tier to create a new ji1di4. As a result, the two Jidi words (i.e., the new ji1di4 and ji2di4) were identical except for the pitch contour. Whereas the two T2-based words had a rising pitch on V1, the two T1-based words had a relatively flat pitch on V1. Four native Mandarin speakers who did not participate in the experiments listened to the manipulated words and were able to identify T1-150 and T1-250 as ji1di4 'base' and T2-150 and T2-250 as ji2di4 'polar area'.
Note to Table 1. The stimuli in bold were used as standard in the MMN experiment. The numbers in parentheses indicate the pitch change on V1. For example, the V1 in Q2-170 had an onset pitch of 208 Hz and an offset pitch of 188 Hz.
Footnote a. As one of our reviewers pointed out, manipulating the V1 duration in a disyllabic context brings the ambiguity that at the duration change point (i.e., 260 ms for Sada and 240 ms for Jidi), the change can be interpreted as either a duration change (short V1 vs. long V1) or a segmental change (C2 from 261 or 241 ms for the short stimuli, but still V1 for the long stimuli). Since the duration feature always occurs in a disyllabic context in Estonian quantity (Lehiste, 1997; Lippus et al., 2009), and in light of a number of previous studies (e.g., Kirmse et al., 2008; Sadakata & Sekiyama, 2011; Ylinen et al., 2005, 2006), we kept the current manipulation. We will discuss the perception of duration later in the Discussion.
To create language-based pure tones, we extracted the pitch contours of the Sada words and the Jidi words and created sinewaves from the extracted pitch contours. The created pure tone stimuli thus resembled the physical features (e.g., duration of each segment and pitch contour) of their corresponding words, except that they did not carry any consonants or vowels. This manipulation allows us to compare the non-speech and speech stimuli more directly. The pure tone stimuli were normalized to a sampling frequency of 44.1 kHz. All the word and pure tone stimuli were normalized to an intensity of 80 dB. The recordings and manipulations of all stimuli were performed in the Praat software (Boersma, 2001). The physical features of all stimuli are shown in Supplementary Figure S1.
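As an illustration of the idea only (the stimuli themselves were created in Praat, and the exact procedure is not reproduced here), a sinewave that follows a pitch contour can be obtained by accumulating phase from the instantaneous F0. The sketch below assumes a piecewise-linear contour given as hypothetical (time, F0) points; the function name and its parameters are our own, and the example contour uses only the V1 onset and offset pitch reported for Q2-170 (208 Hz to 188 Hz over 170 ms).

```python
import math

def synthesize_pure_tone(f0_points, duration_s, sample_rate=44100):
    """Sine wave whose instantaneous frequency follows a pitch contour.

    f0_points: list of (time_s, f0_hz) pairs sorted by time (hypothetical
    input format); F0 is linearly interpolated between neighboring points.
    """
    n_samples = int(round(duration_s * sample_rate))
    samples = []
    phase = 0.0
    idx = 0
    for n in range(n_samples):
        t = n / sample_rate
        # Move to the contour segment that contains time t.
        while idx + 1 < len(f0_points) - 1 and f0_points[idx + 1][0] <= t:
            idx += 1
        (t0, f_start), (t1, f_end) = f0_points[idx], f0_points[idx + 1]
        frac = min(max((t - t0) / (t1 - t0), 0.0), 1.0)
        f0 = f_start + frac * (f_end - f_start)
        samples.append(math.sin(phase))
        # Accumulate phase so that frequency changes stay click-free.
        phase += 2.0 * math.pi * f0 / sample_rate
    return samples

# V1 of Q2-170: onset 208 Hz, offset 188 Hz, 170 ms (linear fall is an assumption).
tone = synthesize_pure_tone([(0.0, 208.0), (0.170, 188.0)], 0.170)
```

Accumulating phase, rather than evaluating sin(2πft) with a time-varying f, keeps the waveform continuous when the frequency changes.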
Each set of stimuli, i.e., Sada word, Sada pure tone, Jidi word, and Jidi pure tone, was used as an independent MMN series in the EEG experiment. The EEG experiment adopted the multi-feature paradigm (Lyu et al., 2024; Näätänen et al., 2004; Schröger, 1995; Tervaniemi, 2022) such that one stimulus was standard and several different deviants were presented in between. For the series of Sada word and Sada pure tone, Q2-170 was used as the standard and the other three as deviants; for the series of Jidi word and Jidi pure tone, T1-150 was used as the standard and the other three as deviants. It is worth noting that in the series of Sada word and Sada pure tone, the Q3-based deviants (i.e., Q3-170 and Q3-290) had different lengths of second syllables than the standard Q2-170 as we kept the C1, C2, and V2 of the base words unmanipulated. However, since the focus of the present study is the MMN elicited by the difference in V1, the differences in the second syllable and the prolonged MMNs they might elicit will not be analyzed. Each MMN series thus contained three types of V1-based deviants: one deviant that differed from the standard in terms of duration only (the Duration condition, Sada Q2-290 and Jidi T1-250), one deviant that differed from the standard in terms of pitch only (the Pitch condition, Sada Q3-170 and Jidi T2-150), and one deviant that differed from the standard in both duration and pitch (the DurPitch condition, Sada Q3-290 and Jidi T2-250).
Each MMN series started with 15 repetitions of the standard to create memory traces, after which every other standard was followed by a different deviant. The deviants were randomized such that no two consecutive deviants were the same. Each deviant was repeated 100 times. Together, in each MMN series, the participant was presented with the standard 315 times and with deviants 300 times (i.e., 100 times per deviant type). It has been suggested that using physically different standard and deviant stimuli in the same block underestimates genuine MMN due to ERP reflections of differential processing of the two physically different stimuli (Jacobsen & Schröger, 2001, 2003). With the current design, the deviant-minus-standard ERP differences could possibly be a sum contribution from MMN and N1 (Horváth et al., 2008; Näätänen et al., 2005). Although it is less likely that the pure, genuine MMN will be elicited, the deviant-minus-standard difference waves still reflect physical differences between the standard and the deviant and can be considered an indicator of the brain's discrimination ability of the auditory input.
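The counts above (15 lead-in standards, 100 presentations per deviant, 315 standards and 300 deviants in total) can be reproduced with a short sketch. This is not the script used in the experiment, which was written in MATLAB; it rests on two assumptions of ours: that each deviant is directly preceded by one standard after the lead-in, and that the no-repeat constraint is met by drawing the three deviants in shuffled blocks. The function name and seed are hypothetical.

```python
import random

STANDARD = "standard"
DEVIANTS = ("Duration", "Pitch", "DurPitch")

def build_mmn_series(n_lead_in=15, n_per_deviant=100, seed=0):
    """Return one MMN series as a list of condition labels."""
    rng = random.Random(seed)
    deviant_order = []
    for _ in range(n_per_deviant):
        block = list(DEVIANTS)
        rng.shuffle(block)
        # Reshuffle if this block would repeat the deviant that ended the
        # previous block, so no two consecutive deviants are the same.
        while deviant_order and block[0] == deviant_order[-1]:
            rng.shuffle(block)
        deviant_order.extend(block)

    series = [STANDARD] * n_lead_in
    for deviant in deviant_order:
        # Assumption: one standard precedes every deviant.
        series.extend([STANDARD, deviant])
    return series

series = build_mmn_series()
```

Block-wise shuffling is only one way to satisfy the constraint; it additionally spreads the three deviant types evenly over the series, which a fully free randomization would not guarantee.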
In the behavioral AX discrimination experiment, only the non-native stimuli were included due to the length of the whole experiment. Sada word and Sada pure tone were presented to Chinese participants and Jidi word and Jidi pure tone to Estonian participants. Trials were presented in pairs in the AX experiment (see Procedure). For both word and pure tone stimuli, there were four same pairs, each being repeated 15 times, and 12 different pairs (i.e., all combinations of different pairs), each being repeated five times. Together, there were 120 pairs of word stimuli and 120 pairs of pure tone stimuli for each subject. The stimuli were divided into four blocks, including 60 pairs of non-native words, 60 pairs of non-native pure tones, 60 pairs of non-native words, and 60 pairs of non-native pure tones.
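The pair counts follow directly from the four stimuli in each set: 4 same pairs × 15 repetitions = 60 trials, and 4 × 3 = 12 ordered different pairs × 5 repetitions = 60 trials, giving 120 trials per stimulus type. A minimal sketch of this enumeration is given below; the function name is hypothetical, and treating the different pairs as ordered (A-X and X-A counted separately) is inferred from the reported total of 12.

```python
from itertools import product

def build_ax_pairs(stimuli, n_same=15, n_diff=5):
    """Enumerate same and different AX trials for one stimulus set."""
    same = [(s, s) for s in stimuli for _ in range(n_same)]
    different = [
        (a, x)
        for a, x in product(stimuli, repeat=2)
        if a != x
        for _ in range(n_diff)
    ]
    return same, different

jidi = ["T1-150", "T1-250", "T2-150", "T2-250"]
same, different = build_ax_pairs(jidi)
```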

Procedure
For the experiment in China, participants filled out a background questionnaire that asked for their demographic information, language skills, relevant medical conditions, musical experience, and handedness on paper after arriving in the lab, and then they went through the EEG experiment and the behavioral experiment. The EEG experiment consisted of four series, each lasting 10-12 min. The stimuli in each series were presented using customized MATLAB (MathWorks, Natick, MA, United States) scripts with a random inter-stimulus interval (ISI) of 400, 425, or 450 ms. All stimuli were presented to participants through Shure SE112 insert earphones (Shure Inc., Niles, IL, United States) at a fixed range of volumes across participants. During the EEG experiment, participants were presented with a silent movie with Chinese subtitles on a 10.5-inch iPad Pro (Apple Inc., Cupertino, CA, United States). The order of the series was randomized among participants. Participants were given several breaks between series.
The behavioral experiment was performed in E-prime 3 (Psychology Software Tools, Pittsburgh, PA, United States). Participants listened to two sounds in a row at a fixed ISI of 300 ms, after which they were asked whether the two sounds were the same or different. Participants were instructed to answer upon the completion of the second sound using the keyboard (F represented "same" and J represented "different"), and they had 1000 ms to respond. The audio stimuli were presented to participants through Shure SE112 insert earphones (Shure Inc., Niles, IL, United States). The experiment started with 10 practice trials with a non-word tada to familiarize the participant with the procedure, after which the experimental trials were presented in four blocks (see Stimuli). The word block and the pure tone block were placed one after the other (i.e., there were no two consecutive word blocks or pure tone blocks). Block sequence was counterbalanced across subjects. The trials within each block were pseudo-randomized. Participants were given several breaks between blocks. The whole behavioral experiment lasted for about 15 min.
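As a rough consistency check on the reported series length, the expected duration of a Jidi series can be worked out from the segment durations given in Stimuli (90 ms C1, 150 or 250 ms V1, 40 ms C2, 240 ms V2) and the presentation counts. The sketch below is our own back-of-the-envelope calculation, not part of the experimental scripts; it assumes that the ISI is measured from stimulus offset to the next onset, that the three ISI values are equally likely, and that breaks are excluded.

```python
FIXED_SEGMENTS_MS = {"C1": 90, "C2": 40, "V2": 240}
V1_MS = {"T1-150": 150, "T1-250": 250, "T2-150": 150, "T2-250": 250}
PRESENTATIONS = {"T1-150": 315, "T1-250": 100, "T2-150": 100, "T2-250": 100}
ISI_MS = (400, 425, 450)

def stimulus_ms(name):
    """Total duration of one Jidi stimulus in milliseconds."""
    return sum(FIXED_SEGMENTS_MS.values()) + V1_MS[name]

def expected_series_minutes():
    """Expected length of one Jidi MMN series, excluding breaks."""
    sound_ms = sum(stimulus_ms(n) * k for n, k in PRESENTATIONS.items())
    n_stimuli = sum(PRESENTATIONS.values())
    mean_isi_ms = sum(ISI_MS) / len(ISI_MS)
    # Assumption: one offset-to-onset ISI between consecutive stimuli.
    silence_ms = mean_isi_ms * (n_stimuli - 1)
    return (sound_ms + silence_ms) / 60000.0

minutes = expected_series_minutes()
```

Under these assumptions the estimate comes to about 10 min, at the lower end of the reported 10-12 min per series.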
For the experiment in Estonia, participants received a link to the background questionnaire that asked for their demographic information, language skills, relevant medical conditions, musical experience, and handedness and finished the questionnaire before coming to the lab. In the lab, the participants' self-reported normal hearing was first confirmed by audiometry, after which they went through the EEG experiment and the behavioral experiment of the current study, and an additional dichotic listening task from our other project that is not reported here. The EEG experiment followed the same procedure as described for the experiment in China, except that the silent movie was played with Estonian subtitles on a Mitsubishi Diamond Pro 2070SB 22-inch computer screen (Mitsubishi Electric, Tokyo, Japan). The audio stimuli in the EEG and behavioral experiments were presented with Shure SE112 insert earphones (Shure Inc., Niles, IL, United States) at a fixed range of volumes across participants. The behavioral experiment was performed in E-prime 2 (Psychology Software Tools, Pittsburgh, PA, United States) following the same procedure as in the experiment in China.
Footnote 1. Since we will not directly compare the results of the Estonian stimuli and the Chinese stimuli, the different methods used for recording the stimuli in the two languages should not affect the observed effects and their interpretation.

EEG recording
For the experiment in China, EEG was recorded with the actiCHamp 64-channel system (Brain Products GmbH, Munich, Germany) at a sampling rate of 500 Hz with FCz as the online reference. No online filter was applied during the recording. Five electrodes were taken out of the cap: four in the posterior areas that were less relevant to the focus of the current study (PO3, PO4, O1, O2), and FT10, which did not have a comparable electrode in the EEG system used in Estonia. These electrodes were instead placed on the back of the participant's earlobes, later used as an offline reference, and on the participant's face (i.e., above, below, and at the outer canthus of the left eye) to record blinks and eye movements (Supplementary Figure S2). The raw EEG was thus recorded with 59 active electrodes.
For the experiment in Estonia, EEG was recorded with the Biosemi ActiveTwo 64-channel system (Biosemi B.V., Amsterdam, the Netherlands) at a sampling rate of 512 Hz and with an online filter of 0.16-100 Hz (Supplementary Figure S2). Two additional electrodes were placed on the back of the participant's earlobes and later used as an offline reference. Four additional electrodes were placed on the participant's face (i.e., above and below the left eye and at the outer canthi of the left and right eyes) to record blinks and eye movements.

EEG pre-processing
To minimize raw data heterogeneity across sites, the data from both China and Estonia were first converted to have an equivalent reference and bandwidth in BrainVision Analyzer 2.1 (Brain Products GmbH, Munich, Germany). The data were re-referenced to the average of the earlobes, and a Butterworth Zero Phase Filter (0.1-30 Hz, 24 dB/oct) was applied to the continuous EEG data. After that, the Gratton and Coles algorithm (Gratton et al., 1983) was used to reduce the influence of eye movements and blinks. The data were segmented from 100 ms before to 600 ms after the stimulus onset. Baseline correction was based on the 100 ms prestimulus interval. For each segment, electrodes were marked as containing artifacts if the signal violated any of the following criteria: 50 µV as the maximum allowed voltage step, 200 µV as the maximum allowed absolute difference in an interval of 200 ms, −75 and 75 µV as the minimum and maximum allowed amplitudes, and 0.5 µV as the lowest allowed activity in an interval of 100 ms. All channels were included in the artifact detection. The data in each channel were then averaged within each subject for each event type (i.e., standard and deviant in each condition) in each series. Each subject's averaged data for each event type in each series were exported for further analysis in R version 4.2.2 (R Core Team, 2024).
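The four rejection criteria can be written out as a minimal sketch for a single channel within one segment. This is not the BrainVision Analyzer implementation. The function names are hypothetical, the voltage step is assumed to be measured between consecutive samples, and the interval criteria are assumed to be evaluated over sliding windows.

```python
def window_ranges(samples, width):
    """Yield max-minus-min for every sliding window of `width` samples."""
    for start in range(0, len(samples) - width + 1):
        chunk = samples[start:start + width]
        yield max(chunk) - min(chunk)


def is_artifact(samples_uv, srate_hz,
                max_step_uv=50.0,       # maximum allowed voltage step
                max_diff_uv=200.0,      # maximum allowed difference within diff_ms
                diff_ms=200,
                min_amp_uv=-75.0,       # minimum allowed amplitude
                max_amp_uv=75.0,        # maximum allowed amplitude
                low_activity_uv=0.5,    # lowest allowed activity within low_ms
                low_ms=100):
    """Return True if one channel of one segment violates any criterion."""
    # Assumption: the step criterion compares consecutive samples.
    steps = [abs(b - a) for a, b in zip(samples_uv, samples_uv[1:])]
    if steps and max(steps) > max_step_uv:
        return True
    if min(samples_uv) < min_amp_uv or max(samples_uv) > max_amp_uv:
        return True
    diff_width = max(1, round(diff_ms * srate_hz / 1000))
    if any(r > max_diff_uv for r in window_ranges(samples_uv, diff_width)):
        return True
    low_width = max(1, round(low_ms * srate_hz / 1000))
    if any(r < low_activity_uv for r in window_ranges(samples_uv, low_width)):
        return True
    return False
```

Passing the sampling rate explicitly lets the same millisecond-based criteria apply to recordings made at different rates, such as the 500 Hz and 512 Hz recordings described above.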
Difference waves (i.e., the MMN) were obtained by subtracting the waveform of the standard from that of each deviant. The MMN amplitude was calculated within a time window of ±20 ms centered on the MMN peak detected at the Fz electrode, separately for each deviant type and each participant (Luck, 2014; Qin et al., 2021; K. Yu et al., 2019). The MMN usually emerges at about 100-250 ms after the difference is detected by the brain (Kujala et al., 2007; Näätänen, 1992; Näätänen & Winkler, 1999). We identified the peak latency of the MMN at Fz in an a priori time window of 100-250 ms after the onset of the difference for the Estonian stimuli and their corresponding non-speech pure tones, and in an a priori time window of 100-200 ms after the onset of the difference for the Chinese stimuli and their corresponding non-speech pure tones, because the differences lasted slightly longer in Sada word and Sada pure tone than in Jidi word and Jidi pure tone. Since there were two cues in the DurPitch condition and we did not know which cue would dominate, a larger time window, running from the onset of the time window of the Pitch condition to the offset of the time window of the Duration condition, was applied to the DurPitch condition. The onsets of the difference and the time windows used for detecting the MMN peak at Fz in each condition are given in Supplementary Table S1.
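The peak-centered amplitude measure can be sketched as follows. The sketch assumes that the MMN peak is the most negative point of the difference wave inside the search window and that window bounds are inclusive. The function and parameter names are hypothetical, and the search window is given relative to stimulus onset (i.e., the onset of the difference plus the a priori lag).

```python
def mmn_mean_amplitude(standard, deviant, times_ms, search_window_ms, half_width_ms=20):
    """Return (peak latency in ms, mean amplitude within +/- half_width_ms of the peak).

    `standard` and `deviant` are averaged waveforms at one electrode (e.g., Fz),
    sampled at the time points in `times_ms`.
    """
    # Difference wave: deviant minus standard.
    diff = [d - s for s, d in zip(standard, deviant)]
    lo, hi = search_window_ms
    candidates = [i for i, t in enumerate(times_ms) if lo <= t <= hi]
    # Assumption: the MMN peak is the most negative point in the search window.
    peak_index = min(candidates, key=lambda i: diff[i])
    peak_time = times_ms[peak_index]
    around_peak = [diff[i] for i, t in enumerate(times_ms)
                   if abs(t - peak_time) <= half_width_ms]
    return peak_time, sum(around_peak) / len(around_peak)
```

Averaging over a short window around the peak, instead of taking the single peak value, makes the measure less sensitive to high-frequency noise at the peak sample.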

Statistical analyses
For the MMN data, we first determined whether MMN responses were elicited by comparing the MMN amplitude against zero for each group, condition, series, and ROI. Afterward, separate linear mixed-effects models (LMMs) were run for the Estonian stimuli and their corresponding non-speech pure tones and for the Chinese stimuli and their corresponding non-speech pure tones. For the Estonian stimuli, we expected Estonian native speakers to outperform Chinese native speakers in perceiving the Duration change and the DurPitch change, but we had no clear prediction for the Pitch change. Since whether the group differences in each condition (i.e., Duration, Pitch, DurPitch) differ significantly from each other is not the focus of the present study, we ran three LMMs, one for each condition. To examine how the two groups perceived the speech and non-speech sounds differently and how the MMN amplitude differed between the left and right hemispheres, we included group (−0.5 Chinese group vs. +0.5 Estonian group), stimulus type (−0.5 speech vs. +0.5 non-speech), and ROI (−0.5 left frontal vs. +0.5 right frontal) as sum-coded independent variables in each model. For the Chinese stimuli, we expected Chinese native speakers to outperform Estonian native speakers in perceiving the Pitch change. Similarly, given that it was hard to predict the results of the Duration and DurPitch conditions and that differences between conditions in the size of the group effect are not the focus of the present study, we again ran three LMMs, one for each condition. The models were constructed in the same way as described for the Estonian stimuli and their non-speech pure tones.
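To make the ±0.5 coding concrete, the sketch below shows how the coefficients of a balanced two-factor design relate to the cell means: the intercept is the grand mean, each main effect is the difference between the two level means, and the interaction is a difference of differences. It is an illustration only. It drops the ROI factor and the random effects, the function names are hypothetical, and the cell means in the usage lines are invented numbers, not results from the study.

```python
LO, HI = -0.5, 0.5  # e.g., Chinese = -0.5 / Estonian = +0.5; speech = -0.5 / non-speech = +0.5


def effects_from_cell_means(cells):
    """Map cell means keyed by (group_code, stimulus_code) to +/-0.5-coded coefficients."""
    intercept = sum(cells.values()) / 4
    group = ((cells[(HI, LO)] + cells[(HI, HI)]) - (cells[(LO, LO)] + cells[(LO, HI)])) / 2
    stimulus = ((cells[(LO, HI)] + cells[(HI, HI)]) - (cells[(LO, LO)] + cells[(HI, LO)])) / 2
    # Interaction: group difference for non-speech minus group difference for speech.
    interaction = (cells[(HI, HI)] - cells[(LO, HI)]) - (cells[(HI, LO)] - cells[(LO, LO)])
    return {"intercept": intercept, "group": group,
            "stimulus": stimulus, "interaction": interaction}


def predict(coefs, group_code, stimulus_code):
    """Reconstruct a cell mean from the coefficients."""
    return (coefs["intercept"] + coefs["group"] * group_code
            + coefs["stimulus"] * stimulus_code
            + coefs["interaction"] * group_code * stimulus_code)


# Invented cell means in microvolts, for illustration only.
example_cells = {(LO, LO): -2.0, (HI, LO): -1.0, (LO, HI): -1.5, (HI, HI): -1.5}
example_coefs = effects_from_cell_means(example_cells)
```

With this coding, a main-effect coefficient is read as the full difference between the two levels, averaged over the other factor, which is why a significant interaction is followed up with post hoc comparisons within each level.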
Regarding the behavioral data, trials that the subjects missed were excluded from the analysis. For each subject, we calculated the sensitivity (d') score for each condition (i.e., Duration, Pitch, and DurPitch) and each stimulus type (i.e., speech and non-speech). As a result, each subject had six sensitivity scores. Although only non-native stimuli were used in the behavioral experiment, we adopted the same approach as previously described and analyzed the behavioral data by running a separate LMM for each condition (i.e., Duration, Pitch, DurPitch). Three LMMs were run. The sensitivity score was included in each model as the dependent variable. Stimulus type (−0.5 speech vs. +0.5 non-speech) and group (−0.5 Chinese group vs. +0.5 Estonian group) were sum-coded and included as independent variables in the model. Subjects were included as random intercepts.
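The source does not state which d' model or correction was used for the same-different task, so the following is only a sketch under stated assumptions. It uses the basic formula d' = z(hit rate) − z(false-alarm rate), treats a "different" response on a different trial as a hit and on a same trial as a false alarm, and applies a log-linear correction to avoid infinite z scores. The function name is hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index from response counts, with a log-linear correction.

    Assumption: hit = "different" response on a different trial,
    false alarm = "different" response on a same trial.
    """
    # Log-linear correction: add 0.5 to each count and 1 to each trial total,
    # so that rates of exactly 0 or 1 never occur.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

Computing this once per subject, condition, and stimulus type would yield the six scores per subject described above.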
In all statistical analyses, LMMs were fit with the lme4 package version 1.1-31 (Bates et al., 2015) in R version 4.3.3 (R Core Team, 2024). The lmerTest R package version 3.1-3 (Kuznetsova et al., 2017) was used to obtain p values. For all LMMs, significant interactions were followed by post hoc tests with Bonferroni correction, performed with the emmeans R package version 1.8.7 (Lenth, 2024). Confidence intervals were calculated using the confint function in R.
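The Bonferroni step applied to post hoc comparisons amounts to multiplying each raw p value by the number of comparisons in the family and capping the result at 1. The sketch below illustrates that arithmetic only. It is not the emmeans implementation, and the function name is hypothetical.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p values for one family of comparisons."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]
```

An adjusted value below the alpha level then indicates significance after correction.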

Results
Since our interest lies in the group difference between Chinese and Estonian participants, only the effects in which group is involved will be presented and interpreted. In the presence of a significant higher-level interaction, lower-level interactions and main effects will not be interpreted.

MMN results
The results of the one-sample t-tests revealed that a significant MMN was elicited in all conditions (see Supplementary Table S2 for mean MMN amplitudes in each condition). We additionally used generalized additive mixed-effects models to visualize the non-linear differences between standard and deviant over time, following the steps described by Wieling (2018). These plots allow visual inspection of the MMN over time and are available in Supplementary Figures S3 and S4. The full statistical output of the LMMs is provided in Supplementary Tables S3 and S4. The averaged ERP activities and the topographic maps for Sada word and Sada pure tone are shown in Figs. 2 and 3, respectively, and the averaged ERP activities and the topographic maps for Jidi word and Jidi pure tone are shown in Figs. 4 and 5, respectively.

Fig. 2. Averaged ERP Activity, the MMN, and Topographic Maps for Sada Word for Chinese (CHN) and Estonian (EST) Groups. Note. The gray-shaded areas show the time window included in statistical analyses, i.e., the MMN amplitudes calculated ±20 ms around the MMN peak at Fz. Topographic maps illustrate the difference between standard and deviant (deviant minus standard) in the gray-shaded areas.

Results of the Chinese stimuli and their corresponding non-speech pure tones
For the Duration condition, we found a significant interaction between stimulus type and group (β = -0.453, SE = 0.229, df = 366, t = -1.975, p = 0.049, CI.lower = -0.900, CI.upper = -0.006), but post hoc tests did not show any significance (ps > 0.3). For the Pitch condition, we found a significant interaction between stimulus type and group (β = -0.710, SE = 0.219, df = 366, t = -3.245, p = 0.001, CI.lower = -1.136, CI.upper = -0.284). Post hoc tests showed that whereas there was no group difference in the processing of non-speech pure tones (p > 0.4), Chinese speakers showed significantly larger MMNs than Estonian speakers to

Discussion
In this study, we investigated how native language background affects the perception of duration and pitch in foreign speech sounds and their corresponding non-speech pure tones. We focused on the Estonian language, whose phonological system features both a duration cue and a pitch cue, and Mandarin Chinese, a tonal language where the alternation of the lexical tone discriminates word meanings. Speakers of both languages participated in an MMN experiment where they passively listened to the stimuli in both languages and their corresponding non-speech pure tones, followed by a behavioral AX discrimination task where they were asked to attentively discriminate the non-native language stimuli and their corresponding non-speech pure tones. The MMN results showed the following: 1) Chinese participants showed larger MMN responses than Estonian participants over the right hemisphere in response to the duration change (i.e., the Duration condition) in the Estonian stimuli and their corresponding non-speech pure tones; 2) Chinese participants showed larger MMN responses than Estonian participants to pitch information (i.e., the Pitch and DurPitch conditions) in the Chinese stimuli, but no group difference was found in the Estonian stimuli; and 3) Estonian participants showed larger MMN responses than Chinese participants in response to the corresponding non-speech pure tones of the Estonian stimuli that contained both duration and pitch changes (i.e., the DurPitch condition). In the AX experiment, where the perception of non-native stimuli was examined, Chinese participants showed a lower sensitivity to the duration change but a higher sensitivity to the pitch change than Estonian participants.

Perception of duration by Chinese and Estonian speakers
We found right-lateralized MMN responses among Chinese native speakers, but not among Estonian native speakers, in the Duration condition of both Sada word and Sada pure tone. Previous studies showed that acoustic information is dominantly processed in the right hemisphere for both native and non-native speakers, but phonological information (i.e., language-specific auditory processing related to meaning) is left-lateralized, especially among native speakers (Gandour et al., 2004; Pulvermüller et al., 2001; Y. Wang et al., 2001; Xi et al., 2010). In the present study, although the deviant in the Duration condition (i.e., Q2-290) did not represent a typical Q2 or Q3 (i.e., it had a large S1/S3 ratio but lacked a pitch cue; see 2.2 Stimuli) and thus its difference from the standard was more at a lower acoustic level, Estonian speakers were still better than Chinese speakers at detecting it, showing comparable MMNs over the two hemispheres. Our results are consistent with Gu et al. (2013), who found right-lateralized MMNs in response to duration change in pure tones among Cantonese native speakers, whose L1 is a tonal language and does not have a duration cue. Because duration in Chinese is not used phonologically, Chinese participants in the present study perceived duration in a non-native context as acoustic information without linguistic meaning, regardless of whether it was carried by a speech sound (i.e., Sada word) or not (i.e., Sada pure tone).
Our behavioral results additionally showed a lower sensitivity among Chinese native speakers than Estonian native speakers when they attentively discriminated the duration change in the non-native language and their corresponding non-speech pure tones. Given that duration does not determine lexical meaning in Mandarin, Chinese native speakers tended to judge the short and long duration in Estonian to be from the same category. The results are consistent with Šimko et al. (2015), who showed that Mandarin speakers were less accurate in duration judgments than Estonian, Swedish, and Finnish speakers. Conversely, when Estonians listened to the stimuli in Chinese, they tended to use their L1 knowledge to distinguish the duration contrast, resulting in a higher sensitivity in the Duration condition. Our behavioral results corroborate the MMN results in showing that at both the attentive and pre-attentive levels, the lack of exposure to a specific sound feature in one's L1 affects the perception of the non-native language and their corresponding non-speech pure tones.
As one of our reviewers pointed out, since the duration change in the present study was manipulated on the V1 of a disyllabic context, the change could alternatively be interpreted as a segmental change because, at the change point, it was already C2 for the short stimulus Q2-170 but still V1 for the long stimulus Q2-290. The duration cue in Estonian quantity always occurs in disyllabic contexts (Lehiste, 1997; Lippus et al., 2009). Manipulating the V1 duration of the Estonian disyllabic word saada 'to send' (Q2) represents how duration is typically used in the Estonian quantity. A similar manipulation has been used in studies on the Finnish quantity (e.g., Kirmse et al., 2008; Ylinen et al., 2005, 2006). For example, Kirmse et al. (2008) investigated Finnish and German speakers' perception of vowel duration change embedded in the first- or second-syllable vowel of the pseudoword sasa. They found shorter MMN latencies for the Finns than the Germans when perceiving vowel duration, suggesting a generally higher sensitivity to duration contrasts in the Finnish group than the German group due to the phonetically more crucial duration cue in the Finnish language than in the German language. Our results and studies on the Finnish quantity demonstrate how speakers of quantity languages benefit from their native language knowledge in perceiving duration change in speech sounds, particularly in a disyllabic context. It has been shown that such an advantage can be generalized to the perception of duration change in other contexts (e.g., short non-speech sounds, as found by Tervaniemi et al. (2006)), although we cannot provide direct evidence for it with the current design.
One limitation of manipulating duration in natural speech is that the pitch of the stimuli is dynamic, and therefore, when the duration of the segment was stretched, the pitch contour was inevitably stretched as well so that it stayed proportionally the same. The Duration-only deviant in the present study came with a small variation in pitch compared to the standard (see Table 1), which may restrict the interpretation and generalization of our results. We suggest that caution is needed when considering our results and when adopting the same manipulation method in future studies.

Perception of pitch by Chinese and Estonian speakers
We found larger MMNs among Chinese participants than Estonian participants in the Pitch and DurPitch conditions of Jidi word. Our results are consistent with previous research that shows an advantage of Chinese speakers' perception of the Chinese lexical tone over non-native speakers (e.g., Chandrasekaran et al., 2007; Shen & Froud, 2019). The deviants in the Pitch and DurPitch conditions of Jidi word were from a different category (i.e., T2) and have a different meaning than the standard (i.e., T1). While Chinese speakers' early cortical processing was tuned to detect the cross-category/phonological difference, Estonian speakers, who do not use pitch contrastively in their L1, might only perceive the difference as physical features. On the other hand, the MMN results of the Pitch and DurPitch conditions in Sada word showed no group difference, suggesting that both Chinese and Estonian speakers can detect the pitch information in the Estonian language. Being native speakers of a tonal language, Chinese participants benefited from their L1 tonal knowledge in automatically discriminating the pitch information in a non-native language like Estonian, regardless of whether it occurred in a pitch-only context or in a complex context where pitch is only one of the cues.
In the AX discrimination task, where participants were presented with the non-native language and their corresponding non-speech pure tones, Chinese subjects showed a significantly higher sensitivity to pitch change than Estonian subjects in the speech but not the non-speech stimuli. It has been proposed that a phonological contrast in a foreign language that closely matches a contrast in L1 is discriminated clearly and acquired easily, as it can be assimilated to an existing contrast in L1 (see PAM; Best, 2019; J. Chen et al., 2020). When attentively discriminating the Estonian stimuli, it is possible that Chinese speakers assimilated the flat pitch in Q2 saada 'to send' to Mandarin T1 (a high-level tone) and the falling pitch in Q3 saada 'to get' to Mandarin T4 (a high-falling tone). Assimilating the non-native speech sounds to two distinct categories in their L1 would explain Chinese participants' high accuracy in the discrimination task. The non-speech pure tones created from the Estonian stimuli, being foreign and far away from the speech domain, did not elicit a similar advantage. The pitch change in the Jidi stimuli presented to Estonian native speakers, however, involves a rising pitch in T2 ji2di4 'polar area' that does not have an equivalent in their L1. This may have caused confusion when Estonian participants tried to discriminate the two Jidis. If T1 ji1di4 'base' and T2 ji2di4 'polar area' were assimilated to the same category (e.g., among some subjects), then discrimination would tend to be less accurate. The behavioral results, together with the MMN results, showed a relatively better pitch discrimination ability among Chinese speakers than Estonian speakers.
It is worth noting that visual inspection showed prolonged and seemingly larger MMNs at around 380-400 ms in the Pitch condition of both Sada word and Sada pure tone (see Figs. 2 and 3). As we kept the C1, C2, and V2 of the Sada base words unmanipulated, the Pitch condition in the Estonian stimuli and their corresponding non-speech pure tones came with an additional duration change in the second syllable (C2V2) that onsets around 334 ms (= 90 ms (C1) + 170 ms (V1) + 74 ms (C2); see Table 1). The MMNs in the later time window of the Pitch condition could be a prolonged effect of the pitch change on V1 and/or an effect of the additional duration difference in the second syllable. However, since the purpose of the Pitch condition is to examine the discrimination of the pitch on V1, the prolonged MMNs in this condition are not the focus of the paper.

Facilitation of the processing of non-speech sounds among Estonian speakers
Estonian native speakers showed a larger MMN to Sada pure tone than Chinese native speakers in the DurPitch condition. The deviant in the DurPitch condition, Q3-290, had a pitch difference in V1 that onsets at 90 ms and is expected to elicit the MMN between 190 and 340 ms (i.e., 100-250 ms after the onset of the difference; see Kujala et al., 2007), a duration difference in V1 that onsets at 260 ms and is expected to elicit the MMN between 360 and 510 ms, and an additional duration difference in the second syllable (C2V2) that onsets at around 334 ms and is expected to elicit the MMN between 434 and 584 ms. The MMN amplitudes in our statistical analyses were from an average time window of 399-429 ms for Sada word and 426-466 ms for Sada pure tone. The MMN in such time windows could have been caused by the duration of V1, by the additional duration difference in the second syllable, or by both, and these contributions are hard to tease apart.
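The expected windows quoted above follow from adding the 100-250 ms MMN lag to the onset of each difference. A minimal sketch of that arithmetic, using the segment durations given in the text (C1 = 90 ms, V1 of the standard = 170 ms, C2 = 74 ms), is shown below. The variable and function names are hypothetical.

```python
# Segment durations in ms, as given in the text (see Table 1).
C1_MS = 90
V1_STANDARD_MS = 170
C2_MS = 74

# The MMN is expected 100-250 ms after the onset of a difference.
MMN_LAG_MS = (100, 250)


def expected_mmn_window(onset_ms, lag_ms=MMN_LAG_MS):
    """Return the (start, end) of the expected MMN window for a difference onset."""
    return (onset_ms + lag_ms[0], onset_ms + lag_ms[1])


onsets = {
    "pitch_on_V1": C1_MS,                                 # 90 ms
    "duration_on_V1": C1_MS + V1_STANDARD_MS,             # 260 ms
    "second_syllable": C1_MS + V1_STANDARD_MS + C2_MS,    # 334 ms
}
windows = {name: expected_mmn_window(onset) for name, onset in onsets.items()}
print(windows)
```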
The Estonian quantity is a complex combination of temporal and tonal components. The second syllable of Q3 is extremely short in spontaneous speech (Lippus et al., 2013). In the present study, we recorded the base words in a natural spontaneous speech context and kept the second syllable in the Q3 base word unmanipulated. A previous study shows that Estonian speakers perceived Q2-170 as Q2 with the highest accuracy and Q3-290 as Q3 with the highest accuracy (Lippus et al., 2009). We thus suggest that the Q3-290 deviant in the DurPitch condition is the most linguistically relevant, i.e., it represents a typical Estonian Q3 in natural speech and contrasts with the standard Q2-170 that represents a typical Estonian Q2. Our DurPitch condition represents a typical Estonian Q3-Q2 contrast, where both the duration and pitch cues exist and are inextricable. As our results showed, while Chinese speakers can discriminate the Q3-Q2 contrast in Sada word as Estonian speakers do, the latter benefit more from their L1 knowledge when perceiving the non-speech pure tones created from the Estonian Q3-Q2 contrast.
Previous studies with non-speech sounds revealed larger MMN amplitudes (Tervaniemi et al., 2006) or shorter MMN latencies (Kirmse et al., 2008) in Finnish speakers, relative to German speakers, in response to duration change, which is of linguistic relevance to the Finnish language, rather than frequency change, which is linguistically irrelevant. In the current study, we found larger MMN responses in Estonian native speakers, relative to Chinese native speakers, in response to pure tone stimuli that resemble the complex combination of temporal and tonal properties occurring in the Estonian language (i.e., the DurPitch condition of Sada pure tone). The observed enhanced MMN responses elicited by non-speech pure tones suggest that the effect of native language background is not restricted to the speech domain (see also, e.g., Krishnan et al., 2009). With the novelty of examining duration and pitch simultaneously and using language-based non-speech sounds, the present study shows that early pre-attentive cortical processing is selectively tuned to the features of the non-speech auditory signal relevant in a particular language. The observed larger MMN to the non-speech pure tones in the DurPitch condition among Estonian native speakers could be attributed to their long-term familiarity with the dimensions of the auditory signal that are of linguistic relevance, i.e., a special combination of the duration (a greater-than-two syllable ratio) and the pitch (falling pitch on the first syllable) information in a disyllabic foot.
We did not find any across-domain effects in the Jidi stimuli, as no group difference was found in Jidi pure tone in any of the deviant types. We suggest that this is because the pure tones created from Jidi are patterns equally unfamiliar to the Chinese and Estonian subjects. In Chinese, each syllable is a tone-bearing unit (Lin, 2007). In principle, disyllabic Chinese words could combine any of the four lexical tones. Therefore, the tonal combinations used as our stimuli, i.e., T1T4 and T2T4, do not necessarily "stand out" in Chinese. This pattern also differs from the typical temporal and tonal combination in the Estonian quantity, a prosodic structure that Estonian native speakers are familiar with. Therefore, both groups had a relatively low familiarity with the Jidi pure tone stimuli and only treated them as linguistically irrelevant sound patterns, resulting in the absence of group differences in the MMN responses.

Conclusion
By including Estonian, a distinct language that uses both duration and pitch linguistically, and by examining duration and pitch simultaneously, the present study adds novel insights to the extensive literature on the effect of native language background on processing foreign languages and non-speech sounds. We showed that long-term familiarity with specific sound features in the native language affects the perception of a foreign language at both the attentive and pre-attentive levels. Chinese speakers perceived the duration change in a non-native language as acoustic information without linguistic meaning. While both Estonian and Chinese speakers could discriminate pitch, Chinese speakers were better at pitch perception due to their long-term exposure to the dominant role of pitch in their native language. Participants could discriminate a foreign language contrast containing two cues if one of them is the prominent cue in their L1. Additionally, we showed that the advantage of L1 extends to the non-speech domain. The present study suggests that the brain's processing of auditory signals can be selectively tuned to the specific sound features of linguistic relevance.

Fig. 1. Pitch contours of the Estonian quantity (left panel) and Chinese tones (right panel). Note. The Estonian quantity is carried by the words /sata/, and the Chinese tones are carried by the syllable /yi/.

Fig. 3. Averaged ERP Activity, the MMN, and Topographic Maps for Sada Pure Tone for Chinese (CHN) and Estonian (EST) Groups. Note. The gray-shaded areas show the time window included in statistical analyses, i.e., the MMN amplitudes calculated ±20 ms around the MMN peak at Fz. Topographic maps illustrate the difference between standard and deviant (deviant minus standard) in the gray-shaded areas.

Fig. 4. Averaged ERP Activity, the MMN, and Topographic Maps for Jidi Word for Chinese (CHN) and Estonian (EST) Groups. Note. The gray-shaded areas show the time window included in statistical analyses, i.e., the MMN amplitudes calculated ±20 ms around the MMN peak at Fz. Topographic maps illustrate the difference between standard and deviant (deviant minus standard) in the gray-shaded areas.

Fig. 5. Averaged ERP Activity, the MMN, and Topographic Maps for Jidi Pure Tone for Chinese (CHN) and Estonian (EST) Groups. Note. The gray-shaded areas show the time window included in statistical analyses, i.e., the MMN amplitudes calculated ±20 ms around the MMN peak at Fz. Topographic maps illustrate the difference between standard and deviant (deviant minus standard) in the gray-shaded areas.

Fig. 6. Behavioral Discrimination Sensitivity to the Non-native Stimuli for Chinese (CHN) and Estonian (EST) Groups. Note. Error bars represent one standard error.

Table 1
Parameters of the Estonian stimuli Sada and the Chinese stimuli Jidi.