Losing access to the second language and its effect on executive function development in childhood: The case of 'returnees'

Abstract
This study examined how relative language proficiency and exposure influence the development of executive function (EF) in 7–12 year-old bilingual ‘returnee’ children. Returnees are children of immigrant families who were immersed in an environment where their second language (L2; English) was the majority societal language and who returned to their native language (L1; Japanese) environment after a period of prolonged, naturalistic L2 exposure. Targeting this population allows us to address the question of how the loss of opportunities to engage in bilingual activities may longitudinally affect EF development. We administered tasks measuring inhibition and monitoring/updating shortly after the children's return to their L1 environment and again one year later. The results showed that the amount of reduction in L2 exposure (i.e., the difference in L2 exposure when they lived in an L2 majority language environment vs. back in the L1 environment) affected children's monitoring and updating abilities. The greater the reduction in L2 exposure the children experienced, the smaller their improvement was on the updating task in the second interval. The finding suggests that losing access to one's L2, that is, less active bilingualism, is associated with attenuated effects in EF development.


Introduction
Executive function (EF) is defined as a set of general-purpose control processes that regulate goal-directed thoughts and actions (Miyake & Friedman, 2012). EF is necessary to solve complex and novel problems and to accomplish desired goals (Elliott, 2003). Although several frameworks of EF have been proposed (e.g., Dual Mechanism Framework: Braver, 2012; Braver, Gray, & Burgess, 2007; The Adaptive Control Hypothesis: Green & Abutalebi, 2013), the unity and diversity model (Miyake et al., 2000) has been widely adopted in research examining the cognitive effects of bilingualism. This model encompasses three partially separable components: (i) inhibition, the ability to suppress irrelevant responses, (ii) shifting, the ability to switch flexibly between mental sets or tasks, and (iii) updating, the constant monitoring and rapid addition/deletion of working memory contents (Miyake et al., 2000). EF emerges in the first years of life and continues to develop throughout childhood and adolescence (Best & Miller, 2010). Among other factors that influence EF development, bilingualism has attracted increasing attention over the last two decades (for further discussions see Bialystok, 2009; Bialystok, Craik, & Luk, 2012). Research on the effects of bilingualism on EF has yielded mixed findings.
https://doi.org/10.1016/j.jneuroling.2020.100906 Received 27 November 2019; Received in revised form 19 January 2020; Accepted 11 March 2020

The effects of bilingualism on EF
Since both languages of bilinguals are constantly activated in language processing (e.g., Dijkstra, Van Jaarsveld, & Ten Brinke, 1998;Hernandez, Bates, & Avila, 1996;Kaushanskaya & Marian, 2007;Kroll, Bobb, & Wodniecka, 2006;Marian, Spivey, & Hirsch, 2003), bilinguals need to monitor their co-activated languages, shift between their L1 and L2, and inhibit the activation of the unwanted language. These mental processes involved in bilingual language comprehension/production are (partially) governed by domain-general EF processes. Thus, since bilinguals have more training in controlling competing sets of languages, this in turn may give rise to enhancement in EF.
The Inhibitory Control Model (Green, 1998) integrates cognitive and linguistic processes by focusing on the role of inhibition. It assumes that continuous interference between two languages is resolved through inhibition of the conflicting activation of the nontarget language. An important source of evidence for the role of inhibitory control comes from studies examining language-switching performance in bilinguals. For example, Meuter and Allport (1999) found an asymmetrical switch cost in the bilinguals' two languages: switching into L1 (dominant language) was slower than switching into L2 (non-dominant language). This asymmetry can be understood as a consequence of inhibition: more cognitive resources are required to suppress the dominant, highly activated language, and therefore releasing that inhibition takes more time. Most studies that have examined switch costs (Costa & Santesteban, 2004; Philipp, Gade, & Koch, 2007; Schwieter & Sunderman, 2008; Verhoef, Roelofs, & Chwilla, 2009) appear to define language dominance as bilinguals' relative proficiency in each language, with unbalanced bilinguals showing a larger switch cost for the dominant than for the non-dominant language (Philipp et al., 2007; Schwieter & Sunderman, 2008), and balanced bilinguals exhibiting a similar magnitude of switch costs for both languages (Costa, Santesteban, & Ivanova, 2006). However, other studies suggest that patterns of switch cost may depend on the relative frequency with which the dominant and non-dominant languages are involved in the switching task (Timmer, Christoffels, & Costa, 2019) or the extent to which the dominant or non-dominant language is used shortly before undertaking the switching task (Declerck & Grainger, 2017). The inhibitory mechanism reflected by switch cost may be influenced by an interaction between how well bilinguals can operate in each language (e.g., proficiency) and the linguistic context in which the speaker is placed (e.g., exposure and use).
Among studies that have examined the influence of bilingualism on EF in children (e.g., Crespo, Gross, & Kaushanskaya, 2019; Crivello et al., 2016; Iluz-Cohen & Armon-Lotem, 2013; Kuzyk, Friend, Severdija, Zesiger, & Poulin-Dubois, 2019; Marton, 2015a; Nicolay & Poncelet, 2013), many have focused on two dimensions of bilingualism: proficiency and language exposure. In terms of proficiency, one study compared inhibition and shifting abilities in balanced high-proficiency, L2-dominant, L1-dominant, and (balanced) low-proficiency bilingual children (Iluz-Cohen & Armon-Lotem, 2013). The findings reveal that balanced high-proficiency, L2-dominant, and L1-dominant bilinguals all outperformed the (balanced) low-proficiency bilinguals on inhibition, and balanced high-proficiency and L2-dominant bilinguals outperformed the L1-dominant and (balanced) low-proficiency bilinguals on shifting. The results suggest that having high proficiency in at least one language (and more favorably both) may have positive effects on EF. In another study that measured the direct correlation between language proficiency and EF (Marton, 2015a), 8-11 year-old children with higher L2 English proficiency (but with various L1s) performed better on a monitoring task than children with lower L2 proficiency. It should be emphasized here, however, that proficiency in the target language (as in Marton, 2015a) and relative proficiency of the two languages (as in Iluz-Cohen & Armon-Lotem, 2013) do not always go hand in hand; a child who is proficient in the L2, for example, does not always have the same level of proficiency in the L1, especially if the L2 is spoken in a majority language context and the L1 in a minority language context (i.e., proficiency in the L2 can sometimes be higher than proficiency in the L1).
Language exposure is more loosely defined in the literature and can relate to the current or cumulative frequency of input (e.g., number of hours/day a child is exposed to a given language at the time of testing and over time in the past) or more broadly to the environmental context in which a bilingual is placed (e.g., Spanish bilinguals living in Spain consequently have more exposure to Spanish). For instance, Crespo et al. (2019) have shown that increased exposure to dual language input (i.e., a situation where both languages are available to the child) contributed to decreased switching and mixing costs in children with robust proficiency in their dominant language, suggesting that the effect of dual language input on EF is modulated by proficiency. Other studies have examined EF of L2 child learners enrolled in an immersion program (Bialystok & Barac, 2012;Carlson & Meltzoff, 2008;Nicolay & Poncelet, 2013;Poarch & van Hell, 2012;Woumans, Surmont, Struys, & Duyck, 2016). A prominent example is a study that compared cognitive performance among Serbian second graders enrolled in a high exposure L2 immersion program (around 5 h of daily exposure for one year), low exposure immersion program (around 1.5 h of daily exposure for one year), and a control monolingual group (Purić, Vuksanović, & Chondrogianni, 2017). The high exposure immersion group performed better than the other two groups on complex working memory tasks, but no group differences were found for inhibition and shifting. The authors suggest that working memory may be a facet of EF that may be especially sensitive to early stages of intensive L2 learning. Taken together, the findings from previous studies show that while daily exposure to the L2 for six months may not be adequate to boost EF in bilingual children (Carlson & Meltzoff, 2008), a year (Purić et al., 2017) or three years of exposure (Bialystok & Barac, 2012;Nicolay & Poncelet, 2013) may have a positive effect on EF development. 
The effect of exposure that is corroborated by these findings, however, is often confounded with proficiency and vice versa (i.e., increased L2 exposure generally results in better L2 proficiency), making it difficult to disentangle the effects of exposure from those of proficiency. Indeed, proficiency and exposure/use are two distinct dimensions of bilingual experience, yet they correlate strongly with each other (Luk & Bialystok, 2013).

Aims and predictions
The aim of the current study is to examine whether language exposure and proficiency, both of which are proxies for bilingual experience, have any effect on the developmental trajectories of EF in a unique context where opportunities to actively engage in bilingualism become limited. We predict that language proficiency at the onset of their return to the L1 environment should modulate returnee children's development in inhibition. The idea here is that children who were more proficient in their L2 while living abroad presumably had to apply less inhibition to their L1 than children who were more proficient in their L1 or had balanced proficiency in both languages. However, when they return to the L1 environment and the dominant language of the environment changes, they may have to inhibit their stronger language (L2) to a greater extent and thus experience the greatest change in the cognitive demands of their language environment. Therefore, children who were more proficient in their L2 at the onset of their return to the L1 environment may improve more on the inhibition task than those whose L1 and L2 proficiency were balanced or those who were more proficient in their L1. We do not expect relative proficiency to affect returnee children's development in monitoring/updating, as there is no theoretical basis as to why relative proficiency at the onset of return would determine the developmental trajectory of monitoring/updating within the context of a language dominance shift. Rather, we predict that L2 exposure should modulate the development of both updating/monitoring and inhibition ability: children who continued to receive L2 exposure may have more opportunities to practice inhibiting the non-target language (L1), monitoring the two languages, and updating information in two different languages, and thus experience the greatest increase in inhibition and updating/monitoring performance over time.

Participants
This study (project title: L2 attrition and L1 acquisition in Japanese-English bilingual children) was approved by the University of Edinburgh Linguistics and English Language Ethics Committee (protocol number 11-1516/5). The participants were 36 Japanese-English bilingual children who acquired English as a second language in a foreign country and had recently returned to Japan. In the first round of testing, 38 children participated, but two dropped out in the second round of testing. All of the bilingual children's parents spoke Japanese as their native language and the children were exposed to Japanese from birth. They all came from families with high socio-economic status (i.e., at least one parent had university-level education and worked in a large-revenue company). The bilingual children were exposed to minimal English before moving to the L2 environment. Thus, they started acquiring English as their L2 when they moved to the foreign environment. The bilingual children's age at Time 1 (first test session) and Time 2 (second test session), age of L2 onset, length of residence in the L2 environment, age of return to the L1 environment, and length of time between return to the L1 environment and the first round of testing (i.e., incubation period) are summarized in Table 1.

Instruments
Language exposure measurement. The Bilingual Language Experience Calculator (BiLEC) was administered to the parents in order to gather information about quantitative language exposure in each language and history of language use (for further information see Unsworth, 2016). This exposure measure takes into account where and with whom the child spent time on an average day in the week and also on the weekend, for how long, and which language(s) each person used when speaking with the child, as well as time spent on extra-curricular activities and the language(s) in which these occurred (Unsworth, Chondrogianni, & Skarabela, 2018). Exposure to L1 Japanese and L2 English when they lived in an L2 majority language environment (measured at Time 1) and when they were back in Japan (measured at Time 2) were extracted using this questionnaire. We computed the difference in L2 exposure between the two language environments as an indicator of the proportion of reduction in L2 exposure. Higher numbers indicate greater reduction in L2 exposure over time. We used the difference in L2 exposure for further analyses, instead of the amount of L2 exposure in Japan, since there was very little variability in the amount of L2 exposure the children received in Japan (mean = 4.5%, SD = 3.2; also presented later in Results section, Table 2). It is important to note here that since L1 and L2 exposure always add up to 100%, the proportion of reduction in L2 exposure correlates perfectly with the proportion of increase in L1 Japanese exposure.
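The exposure-reduction measure described above can be illustrated with a minimal Python sketch. The function names and the example percentages are hypothetical and are not taken from the BiLEC tool or from the study's data; the sketch only assumes, as stated in the text, that exposure is expressed in percent and that L1 and L2 exposure sum to 100% in each environment.

```python
def l2_exposure_reduction(l2_abroad_pct: float, l2_japan_pct: float) -> float:
    """Drop in L2 exposure (percentage points) from the L2 environment to Japan.

    Higher values mean a greater reduction in L2 exposure over time.
    """
    for value in (l2_abroad_pct, l2_japan_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("exposure must be a percentage between 0 and 100")
    return l2_abroad_pct - l2_japan_pct


def l1_exposure_increase(l2_abroad_pct: float, l2_japan_pct: float) -> float:
    """Rise in L1 exposure, assuming L1 + L2 = 100% in each environment."""
    l1_abroad_pct = 100.0 - l2_abroad_pct
    l1_japan_pct = 100.0 - l2_japan_pct
    return l1_japan_pct - l1_abroad_pct


# Hypothetical child: 50% English exposure abroad, 5% after returning to Japan.
reduction = l2_exposure_reduction(50.0, 5.0)
increase = l1_exposure_increase(50.0, 5.0)
print(reduction, increase)
```

Because the two languages sum to 100%, the two functions return the same number for any child, which is why the reduction in L2 exposure and the increase in L1 exposure cannot be entered as separate predictors.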
Proficiency test. We used a category verbal fluency task to measure returnee children's language proficiency. The participants were asked to name either (1) animals or (2) fruits and vegetables in English or Japanese. Half of the bilinguals named animals in English and fruits and vegetables in Japanese, and vice versa for the other half. The children were asked to name as many animals or fruits and vegetables as they could in 1 minute. Repeated words were omitted from the total number of correct responses generated within 1 minute. The difference in the total number of unique words between English and Japanese at Time 1 (i.e., at the onset of return to Japan) was used for further analyses as a measure of their relative proficiency. Higher values indicate stronger proficiency in English.
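As a rough illustration of this scoring, the Python sketch below counts unique responses per language and takes the English-minus-Japanese difference, so that higher values indicate stronger English. The function names and word lists are hypothetical; the sketch assumes that responses have already been screened for correctness and that case and surrounding whitespace should be ignored when detecting repetitions, neither of which is specified in the text.

```python
def unique_word_count(responses: list[str]) -> int:
    """Count distinct responses, ignoring case and surrounding whitespace.

    Assumption: the list already contains only correct category members.
    """
    seen = set()
    for word in responses:
        normalized = word.strip().lower()
        if normalized:
            seen.add(normalized)
    return len(seen)


def relative_proficiency(english: list[str], japanese: list[str]) -> int:
    """English minus Japanese unique-word count (positive = stronger English)."""
    return unique_word_count(english) - unique_word_count(japanese)


# Hypothetical one-minute responses; the repeated items are counted once.
english_words = ["dog", "cat", "Dog", "lion"]
japanese_words = ["inu", "neko", "tora", "uma", "neko"]
print(relative_proficiency(english_words, japanese_words))
```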
Executive control tasks. The two EF tasks administered in the current study are described below. Both tasks were administered on a laptop computer (15-inch screen). The experiment was constructed and administered with E-prime 2 (Psychology Software Tools, Inc., Pittsburgh, PA), which recorded accuracy and response times.
Simon. The Simon task measured the ability to suppress irrelevant responses and control interference. On each trial, participants saw a target picture (either a shoe or a frog) at the top of the screen and had to press the corresponding response key (i.e., either the 'shoe' key or the 'frog' key). To minimize working memory demands, small pictures of a shoe and a frog were displayed at the bottom corners of the screen, each on the same side as the corresponding response key (this side was counterbalanced across participants). On congruent trials, the target was presented on the same side as the matching response, whereas it was presented on the opposite side on incongruent trials. There were 13 practice trials followed by 40 test trials, including 20 congruent and 20 incongruent trials in a random order. The stimuli in the test trials disappeared after a certain amount of time, tailored to each participant's mean response time in the practice trials, in order to make the task challenging for each child regardless of differences in processing and motor speeds. The limit was calculated by multiplying the mean reaction time in practice trials by 1.5. The Simon effect was calculated by subtracting the average reaction time on the congruent trials from that on the incongruent trials. Accuracy was scored as 1 for correct response and 0 for incorrect response.

Table 2
Summary of BiLEC variables split by language and time; 'Abroad' indicates language exposure when the children lived in an L2 majority language environment and 'Japan' indicates exposure when the children returned to Japan; 'Difference' indicates the difference in exposure between the foreign and Japanese environments (the numbers are in percentages).

N-back. The N-back task (adapted from Chevalier, 2018) measured the ability to update information in working memory and to monitor task sets. In this task, children saw a series of pictures presented one at a time and had to press a response key each time the current picture matched the picture presented n trials back. The participants completed three difficulty levels of the N-back task. Each level contained a series of 32 pictures. Four different pictures (smiley face, cat, house, airplane) were used in each level. These pictures were presented one at a time for 1500 ms, preceded by a 500-ms fixation cross. Each picture appeared eight times in each level in random order. Children were instructed to press the space bar if they saw a picture that was the same as the one presented one trial back (1-back), two trials back (2-back), or three trials back (3-back). The order of the three levels was fixed, in order to first familiarize the children with the easiest level (1-back) and end with the most difficult level (3-back). In each level, there were eight target pictures (matched) and 24 non-target pictures (unmatched). The participants had 1500 ms to respond before the picture changed. When the participants pressed the space bar at a matched picture, a green tick appeared as a form of feedback. In contrast, a red cross appeared if the participants incorrectly pressed the space bar at an unmatched picture. Responses were scored as correct when the participants pressed the space bar on a matched picture (i.e., hit trial) or gave no response to an unmatched picture (i.e., correct rejection). Responses were scored as incorrect when the participants did not press the space bar on a matched picture (i.e., miss trial) or gave a response to an unmatched picture (i.e., false alarm). Upon completion of each block, the total percentages of correct and incorrect responses were presented on the screen. Each block was preceded by a practice session and there was a short break between blocks. Accuracy was scored as 1 for correct response and 0 for incorrect response. Reaction time and accuracy were collapsed across level and used for further analyses.
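The scoring rules described for the two tasks can be sketched in Python as follows. This is an illustrative reconstruction, not the E-prime implementation used in the study; all function names and example values are hypothetical. It assumes that reaction times are given in milliseconds and that the lists passed to the Simon functions already contain only the trials to be averaged.

```python
from statistics import mean


def response_deadline_ms(practice_rts_ms: list[float], factor: float = 1.5) -> float:
    """Simon response limit: mean practice reaction time multiplied by 1.5."""
    return factor * mean(practice_rts_ms)


def simon_effect_ms(congruent_rts_ms: list[float], incongruent_rts_ms: list[float]) -> float:
    """Simon effect: mean incongruent RT minus mean congruent RT."""
    return mean(incongruent_rts_ms) - mean(congruent_rts_ms)


def nback_targets(pictures: list[str], n: int) -> list[bool]:
    """Mark each trial as a target if it matches the picture n trials back."""
    return [i >= n and pictures[i] == pictures[i - n] for i in range(len(pictures))]


def classify_nback_trial(is_target: bool, pressed: bool) -> str:
    """Label a trial as hit, miss, false alarm, or correct rejection."""
    if is_target:
        return "hit" if pressed else "miss"
    return "false alarm" if pressed else "correct rejection"


def nback_accuracy(pictures: list[str], pressed: list[bool], n: int) -> float:
    """Proportion of trials scored 1 (hits and correct rejections)."""
    targets = nback_targets(pictures, n)
    outcomes = [classify_nback_trial(t, p) for t, p in zip(targets, pressed)]
    correct = sum(o in ("hit", "correct rejection") for o in outcomes)
    return correct / len(outcomes)


# Hypothetical data for one child.
print(response_deadline_ms([400, 600]))
print(simon_effect_ms([500, 520], [560, 580]))
print(nback_accuracy(["cat", "cat", "house", "cat"], [False, True, True, False], n=1))
```

In the last call, the second trial is a hit and the third is a false alarm, so three of the four trials count as correct.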

Procedure
The first test session was conducted when the children had recently returned to Japan from a foreign environment (mean interval time = 3.6 months), and the second test session was conducted a year later. The administration procedure for the language tests and EF tasks was identical in the first and second test sessions. The experiment was administered at the participants' homes or in classrooms provided by Japan Overseas Educational Services, an organization that supports the education of returnee children. The order of the EF tasks was counterbalanced across participants. The EF tasks lasted approximately 30 min in total for each child. A Japanese-English bilingual researcher spoke to the children in the language they were more comfortable with (either Japanese or English) when administering the EF tasks. As the current study was part of a larger project on language attrition in bilingual returnee children, other language tests not reported here were conducted on the same day as the EF tasks. The EF tasks took place between the language tasks (e.g., Japanese-EF-English or English-EF-Japanese).

Data analysis
We ran mixed effects models to examine whether there were any differences in EF performance from the first to the second round of testing, as well as the effects of individual variables (proficiency and language exposure) on bilinguals' EF performance. We used linear mixed effects modeling for reaction time (RT) and generalized linear mixed effects modeling for accuracy. Before running the models with RT as the dependent variable, the RTs of inaccurate trials were removed for all EF tasks. RTs were log-transformed to correct for non-normality. On the N-back task, RTs of hit trials were used for further analyses. We constructed a model for each EF task with RT or accuracy as the dependent variable. For the N-back model, Time (Time 1: at the onset of return to Japan, Time 2: a year after returning to Japan), L2 exposure, Age of L2 onset, and Age at Time 1 were included as fixed effects. For the Simon model, Time, Trial type (congruent vs. incongruent), L2 exposure, Proficiency, Age of L2 onset, and Age at Time 1 were included as fixed effects. Age of L2 onset and Age at Time 1 were included to control for variance arising from differences in age of L2 onset and age at the first test session. All the continuous variables (i.e., Proficiency, L2 exposure, Age) were centered around the mean. 'Time 2' was set as the reference level for the categorical Time variable. Subject and Item were entered as random intercepts. Models were fit in R (R Core Team, 2013) with the package "lme4" (Bates, Mächler, Bolker, & Walker, 2015). For the lmer/glmer models, we computed mixed model ANOVA tables via likelihood ratio tests using the afex package (Singmann, Bolker, Westfall, & Aust, 2015). The afex package automatically applies sum coding to categorical variables.
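The preprocessing steps that precede model fitting (removing inaccurate trials, log-transforming RTs, and mean-centering continuous predictors) can be sketched in Python as below. This does not reproduce the mixed effects models themselves, which were fit in R with lme4. The trial dictionary keys and function names are hypothetical, and the use of the natural logarithm is an assumption, since the base of the log transform is not stated.

```python
import math
from statistics import mean


def log_rts_of_correct_trials(trials: list[dict]) -> list[float]:
    """Drop inaccurate trials, then log-transform the remaining RTs.

    Assumption: natural log; each trial has 'rt_ms' and 'accuracy' (1 or 0).
    """
    return [math.log(t["rt_ms"]) for t in trials if t["accuracy"] == 1]


def center(values: list[float]) -> list[float]:
    """Center a continuous predictor around its mean."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical trials and predictor values.
trials = [
    {"rt_ms": 450.0, "accuracy": 1},
    {"rt_ms": 900.0, "accuracy": 0},  # inaccurate trial, excluded from RT analysis
    {"rt_ms": 520.0, "accuracy": 1},
]
print(log_rts_of_correct_trials(trials))
print(center([30.0, 40.0, 50.0]))
```

After centering, a value of 0 on a predictor corresponds to the sample mean, so the remaining model terms are evaluated at an average level of that predictor.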

Exposure
The summary of the results from the BiLEC questionnaire is presented in Table 2. There was a significant decline in L2 English exposure from when the children lived in an L2 majority language environment to when they returned to Japan (t(33) = 23.68, p < .001). The smallest reduction in L2 exposure that a child experienced was 26.5%, and the largest was 58.0%. We should highlight the fact that despite living in an L2 majority language environment, children received on average 53.2% of their exposure in Japanese and the other 46.8% in English, suggesting that their language input in Japanese and English was balanced, rather than English-dominant. However, exposure was clearly dominant in Japanese when they returned to Japan, with children receiving only 4.5% of their exposure in English and the other 95.5% in Japanese. As highlighted in the Methodology section, we used the difference in L2 exposure between the L2-majority language context and Japan for further analyses, since there was very little variance in the amount of L2 exposure the children received in Japan, with 7 participants receiving virtually no L2 input.

Relative proficiency
The results of the verbal fluency task are presented in Table 3. On average, the children produced 1.97 words more in Japanese than in English at Time 1 (SD = 0.93) and 2.11 words more in Japanese than in English at Time 2 (SD = 0.78). A linear mixed effects model with Language and Time as fixed effects and subject as a random intercept revealed main effects of Time (E = 0.93, t = 2.39, p = .01) and Language (E = 1.87, t = 3.43, p < .001) but no significant interaction between Time and Language (E = -.29, t = -.05, p = .58). This suggests that returnee children increased their verbal fluency performance over time and generally performed better in Japanese than in English (regardless of the language environment); however, there was no difference in the proportion of increase in verbal fluency performance between English and Japanese.

Table 3
Summary of verbal fluency performance split by language and time; Time 1 indicates onset of return to Japan and Time 2 indicates a year after the children returned to Japan. [Table body not recovered; surviving column headers: English (L2), Japanese.]

EF development and the effect of L2 exposure and proficiency
The mean reaction time (RT), standard deviation, and accuracy for each trial type (Simon) and level (N-back), as well as the Simon cost for the Simon task, are presented in Table 4.
Before running the models, we ran Spearman's correlations between proficiency and L2 exposure in order to ensure that these variables were not strongly correlated and therefore would not raise issues of collinearity. There was no significant correlation between proficiency and L2 exposure (r = 0.09, p = .58), which implies that there is no strong relationship between relative proficiency and how much reduction in L2 exposure the returnees experienced over time.
Simon. The summary of the models for both reaction time and accuracy on the Simon task is presented in Table 5. There was a significant effect of Trial type for both reaction time (E = -.04, t = −7.44, p < .001) and accuracy (E = 0.20, z = 3.05, p = .002), suggesting that incongruent trials were slower and less accurate than congruent trials, yielding a significant Simon cost. However, there was a main effect of Time for reaction time only (E = 0.10, t = 16.89, p < .001), and there was no interaction between Time and Trial type for either reaction time (E = -.0003, t = -.05, p = .95) or accuracy (E = -.09, t = -1.40, p = .16), which suggests that there was an improvement in overall response time, but not in the Simon cost. We were most interested in the estimates of the three-way interaction between Time, Trial type, and Proficiency/L2 Exposure, as this informs us about whether the change in Simon cost was modulated by proficiency and/or L2 exposure. For both reaction time (p's > 0.80) and accuracy (p's > 0.66), no significant three-way interactions were observed, suggesting that proficiency and L2 exposure (as well as age) did not affect changes in the magnitude of the Simon cost over time.
N-back. The summary of the models for both reaction time and accuracy on the N-back task is presented in Table 6. There was a main effect of Time for reaction time (E = 0.06, t = 4.39, p < .001) but not for accuracy (E = -.001, z = -.27, p = .78), indicating that children became faster over time but did not change in terms of accuracy. However, there was a significant interaction between L2 exposure and Time for both reaction time (E = -.03, t = −2.12, p = .03) and accuracy (E = 0.10, z = 2.57, p = .03). This means that changes in reaction time and accuracy were modulated by continued L2 exposure. As shown in Fig. 1, the greater the reduction in L2 exposure the children experienced, the smaller their improvement was in terms of reaction time. In addition, we split the data by Time in order to examine the simple regression effect of L2 exposure on reaction time at each time point. While the effect of L2 exposure was significant for Time 2 (E = 0.06, t = 2.89, p = .003), it was not for Time 1 (E = -.02, t = -.89, p = .37), crucially suggesting that the amount of reduction in L2 exposure affected children's N-back performance only after the children had been immersed in the L1 environment for a year. Similar results were obtained for accuracy, as indicated in Fig. 2: the greater the reduction in L2 exposure the children experienced, the smaller their improvement was in terms of accuracy. Simple regression analysis (split by Time) also showed that the effect of L2 exposure was significant for Time 2 (E = -.01, t = −2.18, p = .02), but not for Time 1 (E = 0.004, t = 0.72, p = .47). We also found a significant interaction between Age and Time for accuracy, indicating that younger children increased their accuracy to a greater extent than older children.
*p < 0.05. **p < 0.01. ***p < .001.
Kubota, et al. Journal of Neurolinguistics 55 (2020) 100906

Discussion
The aim of the current study was to identify the effects of proficiency and language exposure on children's development of EF (inhibition and updating/monitoring) in a specific context, where naturalistic L2 contact from immersion is interrupted because of changes in the language environment due to returning to one's home country. Our findings show that the amount of reduction in L2 exposure, from when the children lived in an L2 majority language context to when they returned to the L1 environment, contributed to their development in updating and monitoring abilities. That is, children who continued to receive L2 exposure (relative to the amount they received in the prior L2 majority language context) experienced the greatest improvement in updating and monitoring performance over time. This finding is corroborated by both reaction time and accuracy on the N-back task: a smaller reduction in L2 exposure contributed to faster performance and higher accuracy after a year of immersion in the L1 environment. As illustrated in Figs. 1 and 2, children who experienced a reduction in L2 exposure of around 55% (for reaction time) or around 40% (for accuracy) appear to show no improvement over time. This suggests that children who experienced a reduction of more than 40% in L2 exposure may no longer be able to benefit from the EF boost provided by bilingualism (at least in the context of updating and monitoring abilities).

Table 6
Estimated coefficients, standard errors, t values (for reaction time), z values (for accuracy) from the models for reaction time and accuracy on the N-back task.

Furthermore, these data fit nicely within current debates and discussions related to so-called advantageous effects of bilingualism on EF, shifting the onus of claims from binary "if" or "if not" (e.g., Lehtonen et al., 2018; Paap, Johnson, & Sawi, 2015) to "under what conditions" of specific bilingual experiences (e.g., Bialystok, 2016, 2017). Our data suggest that more and less active L2 exposure, related in turn to more and less active engagement with bilingualism for the purpose of maintaining it as a cognitive state in the returnee scenario, correlates with individual differences. Bilingual effects on EF are not a de facto given, but rather reflect the active tension and resource allocation required by bilingualism, as shown by our data with respect to the development of EFs themselves. L2 exposure, however, did not modulate the change in inhibition ability (reflected by both reaction time and accuracy). This result is consistent with the findings from Purić et al. (2017), which showed that children who received 5 hours of daily L2 exposure for a year outperformed monolinguals on working memory, but not on inhibition and shifting abilities. The authors explain that the processes of continuously monitoring the vocabulary and syntactic structures as well as storing and updating this information may be especially taxing during the initial period of second-language acquisition, as is the case for early language acquisition in monolinguals. In a similar vein, Nicolay and Poncelet (2013) found that children who were enrolled in an immersion school for three years with around 50%-75% of L2 exposure outperformed their monolingual peers on tasks assessing alerting, auditory selective attention, divided attention, and mental flexibility, but not interference inhibition.
Taken together, previous findings on early L2 immersion education, along with our results, give rise to the possibility that monitoring and updating abilities may be a facet of EF that is not only sensitive to initial phases of increased exposure to the L2, but is also vulnerable to immediate effects of loss or reduction in L2 exposure. In fact, the findings in Purić et al.'s study converge nicely with what we found in terms of the amount of reduction in L2 exposure. The children in Purić et al.'s study who showed a working memory advantage received 5 hours of L2 exposure every day for a year, which equals around 33% of L2 exposure during waking hours (assuming that an average child sleeps for 9 h, leaving 15 waking hours). This means that receiving 33% of L2 exposure for a year contributed to enhanced monitoring and updating abilities, while our results show the reverse effect: if the children's L2 exposure is reduced by 40% or more over the course of a year, we see a stagnation in their advancement of updating and monitoring abilities.
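The waking-hours conversion above can be checked with a short calculation. The following is a minimal Python sketch. The function name `exposure_share` and the default of 9 h of sleep are illustrative assumptions based on the parenthetical estimate in the text, not part of the analysis in either study.

```python
def exposure_share(l2_hours_per_day: float, sleep_hours: float = 9.0) -> float:
    """Return daily L2 exposure as a proportion of waking hours.

    ASSUMPTION: waking hours are simply 24 minus the hours slept;
    the 9 h default mirrors the estimate used in the text.
    """
    waking_hours = 24.0 - sleep_hours
    if waking_hours <= 0:
        raise ValueError("sleep_hours must be less than 24")
    return l2_hours_per_day / waking_hours


# 5 h of daily L2 exposure out of 15 waking hours
share = exposure_share(5.0)
print(f"{share:.1%}")  # prints 33.3%
```

Changing `sleep_hours` shows how sensitive the estimate is to the sleep assumption: with 10 h of sleep, for instance, the same 5 h corresponds to a larger share of waking time.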

Moreover, Purić et al. (2017) found a bilingual effect for overall reaction time on the inhibition task (Stroop), but not for differential reaction time, which is the primary measure of inhibition. Similar results were obtained in our longitudinal study: children became generally faster at responding, but the Simon cost did not improve over time. Although there is no clear consensus in the literature as to what the cognitive sources of overall reaction time in non-linguistic interference tasks are, the most recent proposal is that it reflects the conflict-monitoring system, as tasks that involve mixing trials of different types require constant monitoring of trials that require conflict resolution and others that do not (Costa, Hernández, Costa-Faidella, & Sebastián-Gallés, 2009). In their systematic review, Hilchey and Klein (2011) found that bilingual effects are more robustly observed in overall reaction time than in interference effects. This overall reaction time advantage was demonstrated across all age groups, but specifically in child populations (Bialystok, Martin, & Viswanathan, 2005; De Cat, Gusnanto, & Serratrice, 2018; Martin-Rhee & Bialystok, 2008). Following this, it appears to be the case that monitoring ability (as indicated by reaction time and accuracy on the N-back task and overall reaction time on the Simon task) is a specific dimension of EF that improved significantly in our study with older children. It has been suggested that inhibition and shifting abilities undergo rapid development in young childhood, while performance in updating and monitoring follows a more protracted development until young adulthood (Best & Miller, 2010). Indeed, our findings demonstrate no effects of age on the development of a Simon cost in the inhibition task, while age modulated the development of monitoring and updating abilities (in reaction time) on the N-back task.
Specifically, younger children improved their monitoring and updating performance to a greater extent than older children. Thus, we may have only observed an increase in monitoring and updating performance because school age years are more characterized by improvements in working memory, monitoring, and updating, while inhibition generally improves dramatically around pre-school years from the ages of 5-8 (Romine & Reynolds, 2005). Contrary to our prediction, relative proficiency at the onset of return did not influence children's development in inhibition ability. Although the absence of this effect may be attributed to the lack of improvement (and thus variability) in Simon cost over time, it may also reflect the fact that very few children were prominently dominant in English. In our study, relative proficiency was measured as the difference between L1 Japanese and L2 English scores on the verbal fluency task, and therefore was treated as a continuous variable. However, if we follow an alternative, categorical classification of language dominance (Bernardini & Schlyter, 2004;Unsworth, 2003), then the children would be categorized as dominant in Japanese or English when there is a difference greater than 1 standard deviation between the two scores. When applying this classification system to our data, it turns out that 4 children were English-dominant, 13 children were Japanese-dominant, and 19 children were balanced bilinguals. Given that only 4 children showed clear dominance in English over Japanese proficiency, our data may not have been optimal to detect any effect of relative proficiency on children's development in inhibition ability. 
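The categorical classification described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the study: the text does not specify which standard deviation serves as the threshold, so the sketch assumes the sample standard deviation of the L1 minus L2 difference scores. The function name `classify_dominance` and the example scores are hypothetical.

```python
from statistics import stdev


def classify_dominance(l1_scores, l2_scores):
    """Label each child as Japanese-dominant, English-dominant, or balanced.

    ASSUMPTION: the 1 SD threshold is the sample standard deviation of the
    difference scores (L1 Japanese minus L2 English); the source does not
    state which SD is meant.
    """
    if len(l1_scores) != len(l2_scores):
        raise ValueError("score lists must have the same length")
    diffs = [l1 - l2 for l1, l2 in zip(l1_scores, l2_scores)]
    threshold = stdev(diffs)  # requires at least two children
    labels = []
    for diff in diffs:
        if diff > threshold:
            labels.append("Japanese-dominant")
        elif diff < -threshold:
            labels.append("English-dominant")
        else:
            labels.append("balanced")
    return labels


# Hypothetical verbal fluency scores, invented purely for illustration
japanese = [20, 12, 15, 18, 9]
english = [10, 13, 15, 17, 19]
print(classify_dominance(japanese, english))
```

Treating the difference score as continuous, as in the main analysis, avoids the loss of information that any such cutoff introduces.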
Since most of the children in the current study were either Japanese-dominant or balanced bilinguals at the onset of return to the Japanese environment, inhibiting their less-dominant language, English (in the case of Japanese-dominant bilinguals), or an equally proficient language (in the case of balanced bilinguals) may not have been cognitively challenging enough to confer a boost in EF performance. Future studies may need to take this factor into account and test returnee children with wide ranges of relative proficiency.
Moreover, previous work suggests that the verbal fluency task (used to measure relative proficiency) is not only characterized by linguistic representations but also involves recruitment of executive function (Shao, Janse, Visser, & Meyer, 2014), so it may not have been the optimal task to measure the children's language proficiency. Although we have information about their proficiency from general reading, writing, listening, and speaking tests, these data are only available for their L2 English, due to feasibility issues. However, the influence of EF skills on verbal fluency performance should be the same for both the Japanese and English versions of the task. Therefore, even though verbal fluency is not the best measure of absolute proficiency in one language, the task should adequately capture the difference in proficiency between L1 and L2. That said, we suggest that future studies, especially ones that examine the effects of bilingual experience on executive function, use other tasks (such as the Peabody Picture Vocabulary Test or the Clinical Evaluation of Language Fundamentals) to measure children's relative proficiency.

Conclusion
The aim of the current study was to longitudinally examine whether two aspects of bilingual experience, language exposure and proficiency, influence EF development in a specific bilingual population, namely, returnee children. The children who experienced large reductions in L2 exposure were also the ones who showed small or even no improvements in monitoring and updating abilities over the course of a year. We must emphasize here, however, that children's EF performance generally increased with age/time; what is interesting is that the rate of this improvement was affected by L2 exposure. Our findings are among the first to show that losing access to one's L2 in childhood has consequences for EF development, just as learning another language may promote enhancement of EF. Future research should look into whether the same results could be obtained for language use, since it is common for returnee children to receive some exposure to their L2 (e.g., through watching TV or browsing the internet), but that does not necessarily mean that they actively speak the L2 to the same extent when returning to the L1 environment (Unsworth et al., 2018).