Bilingualism and increased attention to speech: Evidence from event-related potentials

A number of studies have shown that from an early age, bilinguals outperform their monolingual peers on executive control tasks. We previously found that bilingual children and adults also display greater attention to unexpected language switches within speech. Here, we investigated the effect of a bilingual upbringing on speech perception in one language. We recorded monolingual and bilingual toddlers’ event-related potentials (ERPs) to spoken words preceded by pictures. Words matching the picture prime elicited an early frontal positivity in bilingual participants only, whereas later ERP amplitudes associated with semantic processing did not differ between groups. These results add to the growing body of evidence that bilingualism increases overall attention during speech perception whilst semantic integration is unaffected.


Introduction
Bilingualism is thought to have implications for cognition beyond the linguistic domain. For example, bilingual toddlers and adults have repeatedly been shown to outperform their peers on tasks that require suppressing interference from stimuli unrelated to the task at hand, i.e., they display better executive control (Bialystok, 2009). This effect, often referred to as the bilingual advantage, has been observed in infants as young as seven months, who are able to switch between tasks when monolingual infants fail to do so (Kovacs & Mehler, 2009), and has even been hypothesised to provide better resilience against dementia (Bialystok, Craik, & Freedman, 2007). Recently, however, large-scale studies have cast doubt on the existence of this phenomenon, failing to replicate the bilingual advantage in executive control tasks (Dunabeitia et al., 2014; Gathercole et al., 2014). The question of a bilingual advantage is thus hotly debated, with some authors proposing that correlates of interference (e.g., indexed by the Stroop effect) may fail to appear in bilinguals because they have globally reduced reaction times (Hilchey & Klein, 2011), whilst others have proposed that the executive control advantage is limited to the domain of inhibitory control, based on interference measures in a flanker task (Wu & Thierry, 2013).
A bilingual upbringing in the first years of life will predominantly differ from a monolingual one in the range and variety of speech sounds encountered. Previous research has sought correlates of differences in auditory processing between bilingual and monolingual children that stem from such differential exposure and use of language. Indeed, bilingual children appear to have enhanced phonological awareness in offline tasks (e.g., pen-and-paper tasks on phonological awareness; Bialystok, Majumder, & Martin, 2003; Campbell & Sais, 1995) and a greater ability to learn new phonemic rules (Kuo & Anderson, 2012) than their monolingual peers. However, when bilingual language input is non-systematic due to self-reported frequent language switching by caregivers, language learning may be negatively affected (Byers-Heinlein, 2013). Yet results of such offline and self-report studies are difficult to replicate (Paap & Sawi, 2014), and tend to differ from studies using online language measures. For example, Bail, Morini, and Newman (2014) found no effect of parental code switching measured in actual conversations on bilingual toddlers' language proficiency measures. Few studies have used online measures of bilingual speech perception or production to establish neural correlates of differences in mono- and bilingual language processing.
In one such brain imaging study on auditory perception of speech sounds, Krizman, Marian, Shook, Skoe, and Kraus (2012) showed that, compared to their monolingual peers, bilingual adolescents display enhanced encoding of a speech syllable (/da/) as reflected in an increased brain stem response. This increase was correlated with higher performance under high auditory processing load only in the bilinguals. Thus, bilinguals seemed to pay greater attention to the speech, resulting in better neural encoding of speech sounds. However, MRI data such as those collected by Krizman et al. (2012) provide no insight into the time-course of attention allocation. Here we studied online speech perception in bilingual and monolingual toddlers using event-related potentials (ERPs). ERPs are electric potentials recorded from the scalp time-locked to the presentation of a particular stimulus (e.g., a word). They provide the opportunity to study the cognitive processes involved in online speech processing with a high temporal resolution. ERP components can be roughly divided into early components reflecting perceptual processes modulated by attention (up to ~300 ms after stimulus onset), and later components reflecting more conscious stimulus processing influenced by voluntary allocation of attention and cognitive strategies (~300-600 ms after stimulus onset). Importantly, studies of early ERPs in response to (speech) sounds have shown that maturation of the auditory cortex is directly linked to language learning and that the rate of maturation differs in monolingual and bilingual children (Kuhl et al., 2008; Shafer, Yu, & Datta, 2011). Previous ERP and behavioural studies have shown that, in the first year of life, phonological discrimination ability is progressively reduced to the phonological repertoire of the native language (Cheour et al., 1998; Kuhl et al., 2006). This process seems to develop similarly for monolingually and bilingually raised children (Burns, Yoshida, Hill, & Werker, 2007), although bilingual children sometimes show inconsistent patterns of phoneme discrimination (Bosch & Sebastian-Galles, 2003), with these skills being proportional to their exposure to each of their languages (Garcia-Sierra et al., 2011). Surprisingly, this differential development of the bilingual auditory cortex has been associated with increased language processing ability rather than an impoverished one (Petitto et al., 2012).
In previous studies, we found that bilingual adults and toddlers display a larger P2 ERP response than their monolingual peers in response to an unexpected language change (Kuipers & Thierry, 2012). The P2 is a positive peak occurring approximately 200 ms after stimulus onset. Although the functional significance of the P2 can vary substantially between experimental contexts, in both the auditory and visual perception literature it is often associated with target detection and classification, and it is modulated by attention (Crowley & Colrain, 2004; Luck & Hillyard, 1994). We interpreted the larger P2 response to a language switch in bilinguals than in monolinguals as an index of increased attention to speech in bilinguals, consistent with findings in six-month-old bilinguals (Shafer, Yu, & Garrido-Nag, 2012).
With ERPs one can also study the process of meaning integration, which is reflected in the amplitude of the N400 ERP response, a negative deflection peaking between 300 and 500 ms after stimulus onset (Federmeier & Kutas, 2002). The more a stimulus is semantically unrelated to its context, the more negative the amplitude of the N400 response. This effect is hypothesised to index the additional neural activation prompted by the semantic analysis of a stimulus when it is unrelated to its context. The N400 has sometimes, albeit rarely, been observed to be delayed in adult late bilinguals compared to their monolingual peers (Ardal, Donald, Meuter, Muldrew, & Luce, 1990; Hahne & Friederici, 2001; Weber-Fox & Neville, 1996), suggesting that late bilinguals (but not early bilinguals; Kuipers & Thierry, 2012, 2013; Weber-Fox & Neville, 1996) show signs of slower semantic stimulus integration. Although the N400 has been observed in infants as young as 14 months (Friedrich & Friederici, 2005) and 11 months (Asano et al., 2015), few data on semantic processing are available from children raised bilingually.
Here, we aimed to determine whether attention to online speech is generally greater in bilingual as compared to monolingual toddlers and whether this tends to affect semantic integration efficiency. Some studies have shown that bilingual children in this age range outperform monolingual children on tasks that specifically tap into executive control, whilst performance in other cognitive tasks seems unaffected (Poulin-Dubois, Blaye, Coutya, & Bialystok, 2011). We presented 2-3-year-old monolingual and bilingual toddlers with picture-spoken word pairs that were either semantically matched or unrelated whilst recording their ERPs. These experimental conditions were embedded in a study on language switch detection (Kuipers & Thierry, 2012). Using the same stimuli and a similar procedure in adult participants, we found that the semantic match condition elicited a relative increase of P2 amplitude in the bilingual participants, whereas semantic processing did not differ between groups, as shown by unaffected N400 amplitudes (Kuipers & Thierry, 2010).

Participants
From each of the two groups of 18 participants tested by Kuipers and Thierry (2012), 14 children had the minimum number of trials (>20) and displayed sufficiently stable ERP waveforms for analysis in the matching and unrelated conditions separately. Language ability was assessed with a shortened British version of the MacArthur-Bates Communicative Development Inventory (CDI; Hamilton, Plunkett, & Schafer, 2000) and an un-normed Welsh translation, both of which were posted to the caregivers before the experiment with the request to fill in the relevant sections to the best of their knowledge. Children were grouped on the basis of language exposure and proficiency. Monolingual children were reported not to know any Welsh words apart from some basic commonly used words (e.g., "diolch", "thank you"), nor to have (or to have had) significant exposure to Welsh (e.g., no Welsh-only day care). The bilingual group consisted of children who had balanced estimated knowledge of Welsh and English and were exposed to both Welsh and English on a day-to-day basis (e.g., Welsh nursery, English or mixed language home, or vice versa). The monolingual English children (8 female; mean age 32 ± 3 months) had a mean CDI score (with a 95% confidence interval) of 268 ± 58 words (i.e., 87% of the words in the list). The Welsh-English bilingually raised children (9 female; mean age 29 ± 3 months) had an English CDI score of 220 ± 48 words (78%) and a Welsh score of 195 ± 65 words (73%). The caregivers of 1 monolingual and 2 bilingual children failed to return the CDI. Mean age did not differ significantly between groups (p > .1), but the difference in scores on the English CDI approached significance (p < .06), which, given the low number of returned CDIs, suggests that the monolingual children may have had a higher English CDI score than the bilingual group. However, the total vocabulary score of the bilingual children (the Welsh and English CDIs combined, corrected for cognates) was 360 ± 112, which did not significantly differ from the number of English words for the English children (p > .4; cf. Pearson, Fernandez, & Oller, 1993). The experimental procedures were approved by the ethics committee of Bangor University and a caregiver gave written informed consent before the experiment.

Materials
Participants were presented with picture-spoken word pairs that were semantically matched in half of the trials. In addition, the language spoken was manipulated in an oddball-like paradigm with English as the frequent (75%) and Welsh as the infrequent (25%) language. We have already reported language change effects (Kuipers & Thierry, 2012) and here we focus exclusively on semantic relatedness between pictures and English words (the low number of Welsh words used being incompatible with ERP analysis). We paired 36 pictures of highly familiar objects or animals with their dominant (basic-level) name and the name of a semantically unrelated picture, avoiding phonological overlap with the picture's dominant name. The picture names were recorded by a female speaker without an apparent accent. The mean familiarity of the words was 577 ± 12 and the mean concreteness was 605 ± 6, both on a scale from 100 to 700 (MRC database, Coltheart, 1981, a database based on adult ratings). The mean frequency (out of a million for 24-36-month-olds; CHILDES database; Baath, 2010) was 405 ± 110, and the mean age of acquisition was 1.95 ± 0.1 years (Coltheart, 1981). The experimenter pressed a button after trials during which the child was not attending to the screen and paused the experiment when the child was too distracted to continue.

Procedure
Children were seated on a caregiver's lap approximately 1.8 m from a screen on which the pictures were projected (with a 60 Hz refresh rate). The visual angle of the stimuli was maximally 9°. In each trial, a picture was presented on the screen, followed 500 ms later by a spoken word played via loudspeakers set to the left and right in front of the participant at an intensity of 60-68 dB. The picture remained on the screen for 2 s, which was more than the duration of any spoken word. Trials were separated by an 800 ms inter-stimulus interval. When the child was not attending to the screen, short movie clips were played to recapture the child's attention.

Data acquisition
Event-related potentials were recorded in reference to the onset of the word. Continuous EEG recordings were sampled at 1 kHz and band-pass filtered between 0.1 and 200 Hz from 22 Ag/AgCl electrodes placed according to the 10-20 convention (Fp1, Fp2, F7, F3, Fz, F4, F8, T7, C3, Cz, C4, T8, P7, P3, Pz, P4, P8, O1, Oz, O2, right mastoid) and referenced to the left mastoid. Impedances were kept below 10 kΩ. Electrodes at peripheral sites were discarded due to excessive noise levels at these locations (e.g., due to pulling on electrode leads and/or resting of the head against the caregiver). The same 9 central electrodes reported previously (Kuipers & Thierry, 2012) were kept for statistical analysis (F3, Fz, F4, C3, Cz, C4, P3, Pz, P4), whilst Fp1 was used for monitoring eye blinks. Off-line, EEG recordings were band-pass filtered between 0.3 Hz (24 dB/oct) and 20 Hz (96 dB/oct) using a zero-phase shift band-pass filter and re-referenced to the average of left and right mastoids. Continuous files were scanned for large artefacts and slow electrode drifts due to eye blinks and body movements, using a ±30 µV artefact rejection procedure applied to the reference channel. Visible artefacts (including some remaining eye blinks) were manually removed based on visual inspection. Epochs of −100 to 900 ms relative to the onset of the word were baseline corrected to a 100 ms pre-stimulus interval and an additional ±150 µV artefact rejection procedure was applied to all electrodes to remove the last remaining artefacts, before computing individual averages for each condition.
Trials following a Welsh word were excluded from the analyses, as were trials during which the child was not attending to the screen. There were 34 and 33 trials on average in the bilingual and the monolingual group, respectively, and this did not differ between groups (p > .8).
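The epoching steps described above (a 100 ms pre-stimulus baseline, a ±150 µV rejection criterion, then a per-condition average) can be illustrated with a minimal sketch. It assumes a 1 kHz sampling rate, so that one sample corresponds to one millisecond, and it reads the rejection criterion as "discard an epoch if any baseline-corrected sample exceeds the threshold in absolute value". The function names are hypothetical and the code is not the authors' actual pipeline.

```python
# Illustrative sketch only: names and the exact rejection rule are assumptions.
N_BASELINE = 100        # samples in the 100 ms pre-stimulus interval at 1 kHz
THRESHOLD_UV = 150.0    # absolute amplitude limit in microvolts


def baseline_correct(epoch_uv, n_baseline=N_BASELINE):
    """Subtract the mean of the pre-stimulus samples from every sample."""
    baseline = sum(epoch_uv[:n_baseline]) / n_baseline
    return [v - baseline for v in epoch_uv]


def is_clean(epoch_uv, threshold_uv=THRESHOLD_UV):
    """True if no sample exceeds the threshold in absolute value."""
    return all(abs(v) <= threshold_uv for v in epoch_uv)


def condition_average(epochs_uv):
    """Baseline-correct each epoch, drop rejected ones, average the rest."""
    corrected = [baseline_correct(e) for e in epochs_uv]
    kept = [e for e in corrected if is_clean(e)]
    if not kept:
        raise ValueError("no artefact-free epochs left to average")
    n_samples = len(kept[0])
    return [sum(e[i] for e in kept) / len(kept) for i in range(n_samples)]
```

In this sketch a single-channel epoch is a plain list of 1001 voltages (−100 to 900 ms inclusive); a multi-channel version would apply the same steps per electrode.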

Statistical analysis
Visual inspection of the grand average ERPs (Fig. 1) and difference waveforms (Fig. 2) revealed two time-windows in which the matching and unrelated conditions distinctly differed: an early P2 and a late N400 time-window. To test for potential group differences in these time-windows, we analysed mean amplitudes in the P2 time-window (170-300 ms; see Van Herten et al. (2008) for similar P2 latencies in slightly younger children) and the N400 time-window (350-550 ms; Kutas & Federmeier, 2011). Although our P2 window clearly falls outside the P1 time-window (Fig. 1), auditory P1 latencies in response to vowels have been reported to extend into our P2 time-window (Shafer et al., 2011). However, other studies using spoken word stimuli observed P1 latencies similar to those found here (100 ms; Mills & Sheehan, 2007). Given the fronto-central distribution of the P2 and N400 modulations, we performed ANOVAs on mean peak amplitudes at 6 frontal and central electrodes in the two time-windows, with relatedness as within- and group as between-participant variables. Greenhouse-Geisser corrections were implemented where applicable.
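As a rough illustration of how a mean amplitude in a time-window is obtained from an averaged waveform, the sketch below converts the window limits given above (170-300 ms and 350-550 ms) into sample indices. It assumes an epoch starting at −100 ms and a 1 kHz sampling rate, as stated in the Data acquisition section; the function name and the inclusive treatment of the window limits are assumptions.

```python
# Illustrative sketch only: the function name and inclusive window are assumptions.
def window_mean(erp_uv, t_start_ms, t_end_ms, epoch_start_ms=-100, srate_hz=1000):
    """Mean amplitude of an averaged ERP between two latencies (inclusive)."""
    first = round((t_start_ms - epoch_start_ms) * srate_hz / 1000)
    last = round((t_end_ms - epoch_start_ms) * srate_hz / 1000)
    samples = erp_uv[first:last + 1]
    return sum(samples) / len(samples)


P2_WINDOW_MS = (170, 300)
N400_WINDOW_MS = (350, 550)
```

A call such as `window_mean(erp, *P2_WINDOW_MS)` would then yield one value per participant, condition, and electrode, which is the kind of quantity entered into the ANOVAs.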

Results
The repeated measures ANOVA in the early time-window (170-300 ms) revealed a significant effect of electrode (F(5,130) = 11.4, p < .001, ηp² = .31), an interaction between group and relatedness (F(1,26) = 7.2, p < .05, ηp² = .22), and an interaction between electrode, group, and relatedness (F(2.6,66.8) = 3.5, p < .05, ηp² = .12). A separate ANOVA for the bilingual children revealed that the effects of electrode (F(5,65) = 9.6, p < .001, ηp² = .43) and relatedness (F(1,13) = 5.4, p < .05, ηp² = .29) were significant. Hence, the bilingual children displayed a more positive ERP response to words matching the picture than to unrelated words at frontal and central electrode sites. No significant effects were observed in the separate ANOVA on ERP amplitudes recorded in the monolingual group. Thus, unlike their bilingual peers, the monolingual children's ERP response did not differentiate between matching and unrelated words in the early time-window (see Fig. 3).
The ANOVA on mean ERP amplitude of the late time-window (320-580 ms) only revealed a significant effect of relatedness (F(1,26) = 8.6, p < .01, ηp² = .25; Fig. 4). Importantly, the interaction between group and relatedness was far from significant (p > .4). Hence, the ERP response of the two language groups did not significantly differ in the time-window associated with semantic integration.
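The effect sizes reported alongside the F ratios above can be checked against the test statistics. The sketch below assumes that they are partial eta squared values, which can be recovered from an F ratio and its degrees of freedom as F × df_effect / (F × df_effect + df_error). The function name is hypothetical, and the tuples simply restate the values given in the Results.

```python
# Illustrative sketch only: assumes the reported effect sizes are partial eta squared.
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (effect, F, df_effect, df_error, reported effect size), copied from the Results
REPORTED = [
    ("early: electrode", 11.4, 5, 130, 0.31),
    ("early: group x relatedness", 7.2, 1, 26, 0.22),
    ("early: electrode x group x relatedness", 3.5, 2.6, 66.8, 0.12),
    ("early, bilinguals: electrode", 9.6, 5, 65, 0.43),
    ("early, bilinguals: relatedness", 5.4, 1, 13, 0.29),
    ("late: relatedness", 8.6, 1, 26, 0.25),
]

for effect, f_value, df1, df2, reported in REPORTED:
    computed = partial_eta_squared(f_value, df1, df2)
    print(f"{effect}: computed {computed:.3f}, reported {reported:.2f}")
```

Under this assumption every recomputed value lies within 0.01 of the reported one, with the small differences attributable to rounding of the F ratios.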

Discussion
We tested whether a bilingual upbringing affects the way in which speech is attended to at an early age. ERPs were recorded from monolingually and bilingually raised toddlers exposed to matching and mismatching picture-spoken word pairs. We previously reported that bilingual toddlers' ERPs distinguish between languages earlier than those of monolingual peers, which suggests that bilinguals pay closer attention to speech input (Kuipers & Thierry, 2012). Here we tested whether this is also the case when monolingual English and Welsh-English bilingual children process expected and unexpected English words. We found that only the bilingual toddlers displayed an early positive deflection of the ERP for matching vs. unrelated words. These results provide new evidence for the notion that bilingual children pay greater attention to speech than their monolingual peers.
By contrast, ERPs in the time-window associated with semantic integration did not differ between groups, which is in line with previous findings that speech comprehension is unaffected in bilingual toddlers (Kuipers & Thierry, 2012, 2013; Weber-Fox & Neville, 1996).
Using a similar paradigm and the same stimuli, we previously obtained very similar results in monolingual and bilingual adults. Bilinguals displayed a larger P2 response than monolinguals on right frontal electrodes to matching picture-spoken word pairs as compared to unrelated pairs, whilst N400 amplitude did not differ between language groups (Kuipers & Thierry, 2010). Hence, adult and child ERPs associated with language perception show greater sensitivity to unexpected phonemes in bilingual than in monolingual participants, whilst ERPs associated with semantic integration appear unaffected by language background.
Although the neural sources and functional significance of any ERP component can vary widely between experimental contexts, a frontal, auditory P2 (and the visual P2 alike; Luck & Hillyard, 1994) is mostly associated with allocation of attention and stimulus classification (Crowley & Colrain, 2004). Given that the enhanced bilingual P2 and P2-like responses in the different studies were observed for Welsh words used as deviants in a stimulus stream and also for English words matching a picture, it seems that the contrast between languages is apparent to bilinguals in a way that cannot be equated in monolinguals. However, in the current study it is likely that the decision point of stimulus words had not been reached by the time the P2-like positivity was elicited (up to 300 ms). Therefore, the P2 effect may have been elicited by unexpected phonemes rather than whole word forms.
Given our main finding of increased amplitude of attention-modulated ERPs, we propose that speech stimuli prompt greater attention in bilinguals than monolinguals, enabling better and faster distinction and categorisation of the incoming speech stream. In such a framework, stimuli of particular interest, be it due to language or semantic relevance, can be identified more quickly on the basis of their phonological properties. Indeed, bilingual infants have been shown to discriminate similar sounding phonemes from different languages earlier than their monolingual peers (Sundara & Scutellaro, 2011). A mechanism that tracks and classifies incoming speech sounds based on statistical properties, as in PRIMIR (Curtin, Byers-Heinlein, & Werker, 2011), would be beneficial in situations in which there is no control over upcoming stimuli (i.e., listening to speech) and in which these stimuli can unexpectedly be of a different class (i.e., a different language). Bilingual speech contains unexpected changes in language (Myers-Scotton, 2005; Poulisse & Bongaerts, 1994), sound-to-word mappings, as well as different word-to-meaning mappings. Fast categorisation and identification of speech sounds would allow a bilingual child to dynamically sustain sentence comprehension. A monolingual child does not have such a requirement, which may explain why their ERP response does not differentiate between contextually relevant and irrelevant speech sounds during early perceptual processing. This distinction is made at later stages of speech processing, when the full word form is available for subsequent semantic integration.
By contrast, when critical stimuli are presented visually (e.g., when a word is presented before a picture) we observed in another study that monolingual and bilingual groups showed similar early ERP responses to semantically matching vs. unrelated pictures (Kuipers & Thierry, 2011). Hence, a bilingual upbringing seems to specifically enhance attention to speech. Such enhanced (neural) capacity is in line with observations that bilinguals outperform monolingual peers in tasks that require resources also associated with bilingual language control (Bialystok, 2009). Similarly, computer gaming experts outperform peers on tasks that rely on using those cognitive resources trained by computer game experience (Bialystok, 2006).
The notion that a bilingual upbringing specifically enhances those cognitive abilities required when simultaneously learning two languages has received support from other subdomains of bilingualism research. On the one hand, bilingual language production from an early age seems to enhance executive control, presumably because of the need for increased (inhibitory) control over two or more languages when speaking (Abutalebi & Green, 2007; Costa, Hernandez, & Sebastian-Galles, 2008). Increased attention requirements for linguistic tasks in bilinguals have also been observed in functional brain imaging (Jones et al., 2012), with bilingual language exposure linked to increased grey and white matter density using structural brain imaging (Li, Legault, & Litcofsky, 2014). On the other hand, bilingual language perception from an early age appears to increase the sensitivity of the brain to speech sounds, as observed in dichotic listening tasks (Soveri, Laine, Hamalainen, & Hugdahl, 2011). The question remains whether this bilingual auditory processing advantage can also emerge later in life, that is, whether the brain is plastic enough for bilingual language exposure to result in greater speech sound classification capacity. One study on musical training in 8-10-year-olds seems to suggest that it is possible. Chobert, Francois, Velay, and Besson (2014) showed that one year of music training increases the mismatch negativity response (MMN; Naatanen, Paavilainen, Rinne, & Alho, 2007) to spoken syllables as compared to painting training. Hence, training auditory perception skills appears to boost pre-attentive speech-sound discrimination. However, a stronger bilingual than monolingual auditory brain stem response to speech sounds (Krizman et al., 2012) is more evident in early than late bilingual children (Krizman, Slater, Skoe, Marian, & Kraus, 2015).
In conclusion, although an increased executive control advantage in bilinguals may not be observed in any non-linguistic task (Paap & Sawi, 2014), bilingual language processing characteristics (Maurer & Werker, 2014) and brain plasticity (Li et al., 2014) have been associated with cognitive advantages rather than disadvantages. Here we compared electrophysiological indices of attention and semantic integration in bilingual and monolingual 2-3-year-olds and found increased attention to speech for words matching a picture in bilinguals. However, ERP correlates of semantic integration did not differ between groups, suggesting that bilingualism does not affect speech comprehension.
Further studies need to address the extent to which bilingual children show increased attention to speech in a fully monolingual context, since in our study the toddlers heard Welsh words in 25% of the trials. It is possible that this may have put the bilingual toddlers in an increased state of arousal. However, it is unclear why this would not also be the case for the monolingual toddlers. Also, since the unrelated words were completely different from children's expectations (the name of the picture), it would be important to establish whether bilingual children are also more sensitive to smaller phonological errors such as mispronunciations of known, expected words. Preferential looking data have already shown that subtle variance in pronunciation can be detected by bilingual infants in the absence of such effects in matched monolinguals (Mattock, Polka, Rvachew, & Krehm, 2010). Future studies may also establish the extent to which age of acquisition of a second language influences speech perception. It may be the case that neural plasticity in the first year(s) of life is critical for the development of a bilingual auditory processing advantage. Nevertheless, a consistent pattern of results has emerged, showing that a bilingual upbringing is related to increased attention to speech stimuli (see Maurer and Werker (2014) for a review).
Brain & Language journal homepage: www.elsevier.com/locate/b&l

Fig. 2. Grand average difference waveforms (unrelated minus match) of picture-word pairs corresponding to the electrodes displayed in Fig. 1. The grey vertical bar at 0 ms indicates word onset.