Orthographic learning, fast and slow: Lexical competition effects reveal the time course of word learning in developing readers

Children learn new words via their everyday reading experience but little is known about how this learning happens. We addressed this by focusing on the conditions needed for new words to become familiar to children, drawing a distinction between lexical configuration (the acquisition of word knowledge) and lexical engagement (the emergence of interactive processes between newly learned words and existing words). In Experiment 1, 9–11-year-olds saw unfamiliar words in one of two storybook conditions, differing in degree of focus on the new words but matched for frequency of exposure. Children showed good learning of the novel words in terms of both configuration (form and meaning) and engagement (lexical competition). A frequency manipulation under incidental learning conditions in Experiment 2 revealed different time-courses of learning: a fast lexical configuration process, indexed by explicit knowledge, and a slower lexicalization process, indexed by lexical competition. © 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (creativecommons.org/licenses/by/4.0/).


Introduction
Skilled reading is underpinned by a word recognition system that is fast, efficient and effective. How do children develop such a system and what factors influence children's transition from novice to expert? These questions have been identified under the rubric of orthographic learning (the term used to describe the processes required for a reading system to move from one that is heavily reliant on alphabetic decoding to one that resembles skilled word recognition), but relatively little is known about how orthographic learning happens (Nation & Castles, in press).
We take a novel approach to understanding orthographic learning, focusing on the conditions needed for new words to move from unfamiliar to familiar. The self-teaching hypothesis (Share, 1995, 2008) describes how exposure to a new word offers an opportunity to commit its spelling to memory so that on subsequent encounters, it is recognized more efficiently. Phonological decoding is key as new words are learned via children's own decoding attempts. While plenty of evidence supports this notion, correlations between phonological decoding and orthographic learning are not perfect, indicating that other factors have to be involved (Nation, Angell, & Castles, 2007).
Before we turn our attention to this, a more fundamental issue needs to be discussed. What does it mean to have learned a word? From the spoken word recognition literature, it is clear that word learning evolves over time. Leach and Samuel (2007) proposed an important distinction between lexical configuration, defined as the fast acquisition of word knowledge via episodic memories, and lexical engagement, the slower emergence of interactive processes between newly learned words and existing words. This distinction is also captured by the Complementary Learning Systems account (Davis & Gaskell, 2009; McClelland, McNaughton, & O'Reilly, 1995). Initial exposure to novel words results in episodic representations in the hippocampal memory system, underpinning lexical configuration. Through consolidation, these representations are integrated into the non-episodic neocortical memory network, allowing lexical engagement. Only then can they be considered fully lexicalized. Experiments with adults (Gaskell & Dumay, 2003) and children (Henderson, Weighall, & Gaskell, 2013) support both configuration and engagement as phases in learning new spoken words. Research on orthographic learning in children, by contrast, has focused on lexical configuration only, with learning treated as all or none. This is reflected in the methods used to measure learning, such as spelling or orthographic choice (Wang, Castles, Nickels, & Nation, 2011). As argued above, however, measures of lexical configuration might tap only an initial phase of orthographic learning. Investigating later phases of learning requires sensitive measures that tap lexical engagement, indicating the extent to which new words have become integrated with known words.
A potential marker for lexical engagement processes in orthographic learning is the emergence of a prime-lexicality effect. In lexical decision with masked priming, form-related primes that are nonwords (anple) facilitate target responses (APPLE). In contrast, form-related primes that are words (ample-APPLE) do not facilitate target responses, with inhibition sometimes (but not always) being observed (Davis & Lupker, 2006; Forster & Veres, 1998). Put simply, words and nonwords can be distinguished by their different priming signatures. The general explanation for this is that word primes activate their lexical representations, which then compete with the target words; by definition, nonwords are not in the lexicon and therefore cannot compete. Instead, overlap in orthographic form leads to facilitation. Qiao, Forster, and Witzel (2009) reasoned that as novel words become familiar (via training), there should be a switch in priming: as a nonword is learned, it should begin to behave like a word, competing with orthographically similar words. If this can be captured experimentally via a reduction in facilitation, it will provide a measure of orthographic learning that corresponds to lexical engagement.
The emergence of a prime-lexicality effect has not been explored in children, but relevant evidence from adults was reported by Qiao et al. (2009). They trained adults on a set of nonsense words by asking them to repeatedly type them. Training did not reduce facilitation, suggesting that lexical engagement had not been achieved. Qiao and Forster (2013, Experiment 1) provided explicit meanings for the nonwords, but the trained nonwords remained non-lexicalised. In a second experiment, training with meanings was extended over four sessions. This time, the trained nonwords no longer facilitated responses to target neighbors, showing that lexical competition had been induced. Taken together, these findings from adults suggest that extensive and elaborate exposure is needed to shift written nonwords to words when lexical engagement is tapped.
Clearly, these artificial training experiments are far removed from children's everyday reading experience. Yet, independent reading is rife with new words: after reading Harry Potter, children know something about quidditch, expelliarmus and hippogriff. Our aim here is to investigate what children learn from exposure to unfamiliar (but real) English words via naturalistic reading, and what the time course is. We used the prime-lexicality effect as a measure of lexical engagement, in addition to traditional measures of lexical configuration. We focused on typically-developing 10-year-olds. By this age, children are able to read fluently and are no longer relying heavily on overt phonological decoding. Importantly, previous work has established that reliable data can be collected from lexical decision paradigms, and that standard orthographic form-related priming effects are seen in children of this age (e.g., Castles, Davis, Cavalot, & Forster, 2007). At the same time, we know that the reading system is still developing in late childhood and through early adolescence (e.g., Ben-Shachar, Dougherty, Deutsch, & Wandell, 2011). Establishing the nature of written word learning in children of this age is an important step to help us understand how the reading system makes the transition from novice to expert.
Training studies allow for the experimental manipulation of specific factors to investigate their effects on both lexical configuration and engagement. In Experiment 1, we compared reading stories aloud with reading stories aloud with additional attention paid to the novel words. This allowed us to test whether elaborate, directed exposure is necessary to drive lexical engagement, as seems to be the case for adults (Qiao & Forster, 2013). Having established the utility of the prime-lexicality effect, Experiment 2 manipulated the number of exposures to investigate the time course of learning via naturalistic reading, without directed instruction.

Experiment 1
Experiment 1 investigated whether the prime-lexicality effect could be used as a marker of children's learning of unfamiliar words encountered via storybook reading. Children learn a large proportion of new words incidentally, via independent reading (Nagy, Herman, & Anderson, 1985). In terms of lexical configuration, orthographic learning following this type of exposure has been demonstrated (e.g., Share, 2008). Importantly, however, incidental exposure via reading experience is in sharp contrast to the conditions needed to bring about lexical engagement in experiments with adults, where directed training, such as repeated typing of spellings (Bowers, Davis, & Hanley, 2005) or extended practice with the meaning of the to-be-learned words (Qiao & Forster, 2013), has been used. Moreover, there has been only one demonstration of the prime-lexicality effect following exposure to new written words, namely Qiao and Forster's (2013) directed instruction experiment with adults. Building on their findings, our first experiment investigated whether developing readers show a prime-lexicality effect following exposure to new written words, and whether directed instruction is necessary for this marker of lexical engagement to emerge.
To do this, we compared learning when children's attention was drawn to unfamiliar words, relative to experiencing them with less direction. Children in the reading-only condition read aloud stories containing unfamiliar, low-frequency English words over three exposure sessions. Children in the reading-plus condition read aloud the same stories as in the reading-only condition, with additional practice of spellings and meanings analogous to the methods used in adult experiments (Bowers et al., 2005; Qiao & Forster, 2013). The total number of exposures to each unfamiliar word was constant across conditions. Prior to exposure, unfamiliar words should exhibit nonword-like facilitatory priming. After exposure, the absence of facilitation or the emergence of inhibition would indicate lexical competition between newly-learned words and their neighbors. We predicted greater learning, and therefore a larger prime-lexicality effect, in the reading-plus condition.

Design
Exposure condition (reading-only vs. reading-plus) was manipulated between-subjects. The experiment comprised three phases: pre-exposure, exposure, and post-exposure. Learning of the unfamiliar words was assessed post-exposure both in terms of explicit knowledge of their form and meaning, and with regard to their lexicalization. Explicit knowledge of form was assessed using orthographic decision and spelling, reflecting recognition and recall of the unfamiliar word forms. Knowledge of meanings was assessed via definition knowledge. Lexicalization was assessed via the emergence of a prime-lexicality effect. This was measured by comparing performance on a masked priming lexical decision task before and after exposure to the unfamiliar words, yielding a within-subject factor of test session (pre-exposure vs. post-exposure). Spelling, orthographic decision, and definition knowledge data were collected once, after exposure.

Participants
Thirty children (17 female) aged 9-11 years (M = 10.13, SD = 0.72) participated. We selected this age because, by then, most children have mastered the basics of reading and are engaging in independent reading. All were monolingual native English speakers scoring no more than 1 SD below the standardized average on the Test of Word Reading Efficiency (TOWRE; Torgesen, Wagner, & Rashotte, 1999). Sample size was informed by previous investigations of orthographic learning (e.g., Wang et al., 2011) and masked priming lexical decision in children (e.g., Perea, Jiménez, & Gómez, 2015). Data from one child were excluded because the child did not complete all sessions. The final sample consisted of 15 children in the reading-only group and 14 children in the reading-plus group. The children performed towards the upper end of the average range on the TOWRE (M = 111.72, SD = 13.68) and the two groups did not differ in reading ability or age. The research was approved by the Central University Research Ethics Committee.

Materials and procedure
Twenty word pairs (see Appendix A) comprising an unfamiliar word and its orthographic neighbor base word were selected. Each pair met the following criteria: (a) each unfamiliar word was relatively low in frequency (CELEX written frequency M = 1.21, SD = 1.52; five items not contained in CELEX were counted as zero), e.g., wield; (b) each base word was relatively high in frequency (CELEX written frequency M = 190.53, SD = 516.17), e.g., field; (c) neither item had further orthographic neighbors likely to be known by children, differing by a single letter substitution in the CELEX database (0-2 additional neighbors, M = 0.23; CELEX written frequency M = 3.77, SD = 5.84), e.g., yield.
Pilot testing confirmed that children of this age knew the form and meaning of the base words but had minimal knowledge of the unfamiliar words. To provide a baseline against which to compare spelling performance in the experiment, an additional sample of 20 children were asked to spell all 20 unfamiliar words, without having seen them in an exposure phase. These children were slightly older (M = 11.06, SD = 0.33) than the children in the experiment, but similar in reading ability (M = 118, SD = 11.74). Each spelling attempt was scored as correct or incorrect and as anticipated, performance was low (M = 0.30, SD = 0.14) indicating poor knowledge of the correct orthographic forms by children of this age.
For counterbalancing purposes, the 20 pairs of stimuli were divided into two lists of 10 pairs. Children were allocated to either list based on the order in which they participated in the experiment.
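The counterbalancing step can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text says only that allocation followed participation order, so strict alternation between lists is an assumption, and every stimulus pair other than wield/field is a hypothetical placeholder.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (unfamiliar word, base word)

def split_into_lists(pairs: List[Pair]) -> Tuple[List[Pair], List[Pair]]:
    """Divide the stimulus pairs into two equal-sized lists."""
    half = len(pairs) // 2
    return pairs[:half], pairs[half:]

def allocate_list(participation_order: int) -> str:
    """Assumed scheme: alternate lists by order of participation (1st -> A, 2nd -> B, ...)."""
    return "A" if participation_order % 2 == 1 else "B"

# Only ("wield", "field") comes from the text; the rest are placeholders.
pairs = [("wield", "field")] + [(f"unfamiliar{i}", f"base{i}") for i in range(2, 21)]
list_a, list_b = split_into_lists(pairs)

# Allocation for 30 children in order of participation.
allocation = {child: allocate_list(child) for child in range(1, 31)}
```

Under this assumed scheme, each list is seen by half of the children, so every pair serves equally often as an exposed item and as a control item.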
Pre-exposure phase.
Due to scheduling constraints in school, some children completed this phase on the same day as the start of the exposure phase, whereas others completed it on a separate day. All computerized tasks in the experiment were presented on a Dell Latitude E6400 laptop controlled by E-Prime 2.0 software (Psychology Software Tools, 2012).

Masked priming lexical decision.
This established the baseline for the prime-lexicality effect. Each trial comprised a fixation cross shown for 500 ms followed by a forward mask (#########) displayed for 500 ms. The prime was shown in lowercase letters for approximately 57 ms, followed by the target in uppercase letters. The target was presented until children responded with a button press. Children were asked to indicate whether words were real or made-up. "Real word" responses were indicated by pressing the "m" key on the keyboard, "made-up word" responses by pressing the "z" key.
Children made lexical decisions to 40 targets: 20 words and 20 nonwords. Example stimuli are shown in Table 1. Each target was primed by a masked word. For each child, 10 of the target words were the base words from their allocated list; the other 10 words were base words from the counterbalanced list. The 10 base words from the child's allocated list were primed by their unfamiliar word neighbors (i.e., the words which the child would subsequently experience in the exposure phase). The 10 base words from the counterbalanced list were primed by 10 orthographically unrelated low-frequency control words, defined as differing from the experimental words by more than one letter. The 20 nonwords each differed from an English word by one letter. Half of the nonwords were primed by orthographically related, low-frequency word primes (orthographic neighbors), and half were primed by orthographically unrelated, low-frequency word primes. Following a practice block of eight trials, the experiment itself started with four dummy trials before the 40 items were presented in a random order.
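The composition of the 40 experimental trials described above can be sketched as a 2 (target lexicality) x 2 (prime relatedness) design with 10 trials per cell. The sketch below is illustrative only: the stimulus strings are hypothetical placeholders, the seed and names are assumptions, and the practice and dummy trials are omitted. The timing constants are those reported in the text.

```python
import random
from dataclasses import dataclass
from typing import List

# Timings taken from the text (milliseconds).
FIXATION_MS = 500
FORWARD_MASK_MS = 500
PRIME_MS = 57  # approximate, as reported

@dataclass
class Trial:
    prime: str            # displayed in lowercase
    target: str           # displayed in uppercase until response
    target_is_word: bool  # correct answer: "real" vs. "made-up"
    related: bool         # prime is an orthographic neighbor of the target

def build_trials(n_per_cell: int = 10, seed: int = 0) -> List[Trial]:
    """Build the 2 x 2 cells and shuffle them into a random order."""
    trials: List[Trial] = []
    for target_is_word in (True, False):
        for related in (True, False):
            for i in range(n_per_cell):
                # Hypothetical placeholder strings stand in for real stimuli.
                tag = f"{'word' if target_is_word else 'nonword'}_{'rel' if related else 'unrel'}_{i}"
                trials.append(Trial(f"prime_{tag}", f"target_{tag}".upper(), target_is_word, related))
    random.Random(seed).shuffle(trials)
    return trials

trials = build_trials()
```

Only the related word-target cell (unfamiliar word prime, base word target) carries the prime-lexicality effect; the remaining cells provide the unrelated baseline and the "made-up" responses needed for the lexical decision task.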
Exposure phase.
This ran across three sessions completed on separate days. The number of days across which children completed the exposure phase varied between three and nine days (M = 5.62, SD = 1.06). This did not differ significantly between the two groups. In both conditions, children received a total of 18 exposures to each unfamiliar word during the exposure phase.

Story contexts.
Twenty story contexts were created, one per unfamiliar word. To create a coherent narrative akin to chapters in a book, the stories revolved around two children who had to find and free words stolen by an evil professor. Each story involved a different hiding place where the protagonists had to "crack a code word" by finding out about the meaning of the unfamiliar word embedded in the story. Each story referred to the meaning of the unfamiliar word in six different ways: one dictionary-style definition (e.g., "to wield a weapon means to hold it out ready to attack"), three sentences conveying additional semantic information (e.g., "a knight might draw his sword and wield it at a dragon") and two further mentions without additional semantic cues (e.g., "I see you understand what wield means"). Chapters were laid out on A5 story cards modeled on the typeset of age-appropriate novels, which contained a colored illustration of the unfamiliar word that fitted with the story. Chapters were matched on length (M = 128 words). An example story is provided in Appendix A. Children were randomly allocated to either the reading-only or the reading-plus condition.

Reading-only condition.
In session one, children were told they would be reading a story printed on several cards similar to chapters in a book. They were given a short prologue to read which explained the premise of the stories. They then shuffled the cards face down to determine the order in which stories were read. They were then asked to read each story aloud to the experimenter.
At the end, they read a short epilogue. In session two, children read the stories aloud as in session one in an order of their own choosing, but without prologue and epilogue. To add variety, in session three children were given a map showing the locations used in the story cards. They were asked to decide on a route for the protagonists to take by connecting the locations on the map with a line. They then read aloud the story cards in this order. Throughout, any unfamiliar words read incorrectly were corrected.

Reading-plus condition.
In session one, children read the stories as in the reading-only condition. In session two, after reading each story in the order of their choice, they were asked to write down each unfamiliar word on a card and to provide their own definition underneath. Children could refer back to the story cards. In session three, after completing the map, children were given a modified version of the story cards where all the unfamiliar words were missing and needed to be filled in. They were asked if they could remember the code word from the story based on the picture and the story context. If they could not remember it, the experimenter provided its spoken form. The experimenter then wrote the code word down on a separate sheet to remind them of the correct spelling. Children were then asked to copy down the target words into all the gaps and they then read the completed card aloud. These procedures led to an extra exposure to each unfamiliar word in sessions two and three; to compensate for this (and thereby maintain frequency across conditions), each story card contained five mentions of the unfamiliar word rather than six. Throughout, any unfamiliar words read incorrectly were corrected.

Table 1
Example stimuli for the masked priming lexical decision task.
Trial type                        Prime    Target
Related target-word trial         wield    FIELD
Unrelated target-word trial       neigh    ABOUT
Related target-nonword trial      twine    TWIDE
Unrelated target-nonword trial    sprit    GHARE

(2013), this assessed how well children learned the forms of the unfamiliar words. A nonword homophone distractor was created for each unfamiliar word (e.g., for wield, we constructed whield) as well as two nonword distractors differing by one consonant (e.g., wielk, wielp). While care was taken to maintain length, some homophone distractors needed to differ in length from the unfamiliar words. Items were presented one at a time and children were asked to indicate whether or not the shown word was one of the code words from the story, correctly spelled. As with lexical decision, yes responses were indicated by the "m" key on the keyboard and no responses by the "z" key. Children made decisions to the 10 unfamiliar words from their allocated list and 30 distractors. For each unfamiliar word, performance was coded as accurate if a yes response was made to the correct spelling and all three incorrect spellings were rejected.

Spelling.
The 10 unfamiliar words were dictated from the appropriate counterbalanced list in a random order and children were asked to write them down. Responses were scored as correct or incorrect.
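The all-or-nothing scoring rule for orthographic decision (accept the correct spelling and reject all three distractors) can be expressed as a short function. This is a sketch of the rule as stated in the text; the function name and the dictionary representation of responses are assumptions made for illustration.

```python
from typing import Dict

def score_item(responses: Dict[str, bool], correct_spelling: str) -> bool:
    """Score one unfamiliar word.

    `responses` maps each presented spelling to True ("yes") or False ("no").
    The item counts as accurate only if the correct spelling was accepted
    and every distractor was rejected.
    """
    accepted_target = responses.get(correct_spelling, False)
    rejected_all_distractors = all(
        not said_yes
        for spelling, said_yes in responses.items()
        if spelling != correct_spelling
    )
    return accepted_target and rejected_all_distractors

# Example responses for the item "wield" and its three distractors.
all_correct = {"wield": True, "whield": False, "wielk": False, "wielp": False}
one_slip = {"wield": True, "whield": True, "wielk": False, "wielp": False}
```

Because a single accepted distractor makes the whole item inaccurate, this criterion is stricter than scoring each of the four decisions separately.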

Results
Linear mixed effects models with maximal model structure (see Barr, Levy, Scheepers, & Tily, 2013) were computed using the lme4 package (version 1.1-7; Bates, Maechler, Bolker, & Walker, 2015) in R (version 3.1.0; R Core Team, 2014). By-subject and by-item random intercepts and random slopes were included for each fixed main effect and interaction, as recommended by Barr et al. (2013). Fixed effects were centered around their mean to minimize collinearity. Models which failed to converge with maximal structure were simplified by removing the random effects explaining the least variance in the outcome variable, or by removing random correlations. For models with continuous outcome variables, significance was determined using the cut-off point of |t| > 2. For models with binary outcome variables, mixed effects logistic regression models were computed. Here, the cut-off for significance was p < .05. To control for individual variation in word reading ability, children's raw score on the TOWRE was included as a covariate.
Fig. 1 shows that children learned the unfamiliar words well, both in terms of form (orthographic decision and spelling) and meaning (definitions). For reference, spelling data from the additional group of children who did not see the words (see Method) are also plotted. Children in both exposure conditions were better able to spell the words than those at baseline (reading-only: t = −11.78, p < .001; reading-plus: t = −12.14, p < .001), confirming that exposure increased spelling accuracy.
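Two details of the analysis approach, mean-centering of fixed effects and the t-value cut-off, can be illustrated with a short sketch. It assumes a 0/1 dummy code for exposure condition with the reported group sizes (15 and 14), and it reads the cut-off as applying to the absolute value of t, since negative t values are reported as significant elsewhere in the results. The function names are hypothetical, and this is not the authors' R code.

```python
from typing import List

def center(values: List[float]) -> List[float]:
    """Subtract the mean so that the predictor averages to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def is_significant_t(t_value: float, cutoff: float = 2.0) -> bool:
    """Apply the |t| > 2 criterion used for continuous outcomes."""
    return abs(t_value) > cutoff

# Assumed dummy coding: 15 reading-only children (0) and 14 reading-plus children (1).
condition = [0.0] * 15 + [1.0] * 14
centered = center(condition)
```

With unequal group sizes the centered codes are not exactly -0.5 and +0.5 (here they are -14/29 and 15/29), which is what distinguishes mean-centering from simple sum coding.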

Word learning as indexed by lexical configuration
To compare performance between exposure conditions, mixed effects logistic regression models were fit separately for each task, with exposure (reading-only vs. reading-plus) entered as a fixed factor. The results of the statistical models are shown in Table 2. Learning was strong and broadly equivalent across the reading-only and reading-plus conditions, with better spelling performance in the reading-plus condition (orthographic decision, p = .513; spelling, p = .021; definitions, p = .294). TOWRE was a significant covariate in the analysis of orthographic decision (p = .023), indicating better performance for children with higher TOWRE scores. For spelling and definition knowledge, TOWRE was not a significant covariate.

Word learning as indexed by lexical engagement
Due to an error in the experimental script, 18 trials (1.58%) were excluded from the lexical decision data. Data for the pre-exposure session were missing from one child in the reading-plus condition. Accuracy was high with errors made on only 6.77% of trials. Only correct trials were analyzed. Based on visual inspection of the qq-plot, RTs were inverse transformed to reduce positive skew, as recommended by Baayen and Milin (2010) for lexical decision data (see footnote 1), and one data point was removed. Descriptive data showing the results per exposure condition are shown in Fig. 2, where raw RT data are plotted for ease of interpretation.
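The effect of an inverse transformation on positively skewed RTs can be illustrated as follows. The exact form of the transform is not given in the text; the sketch assumes -1000/RT, a common choice that keeps slower responses numerically larger, and uses made-up RT values rather than data from the experiment.

```python
from typing import List

def inverse_rt(rt_ms: float) -> float:
    """Assumed transform: -1000 / RT (order-preserving, compresses slow responses)."""
    return -1000.0 / rt_ms

def skewness(xs: List[float]) -> float:
    """Population skewness: third central moment over variance to the power 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Made-up RTs (ms) with a long right tail, for illustration only.
raw = [600, 700, 800, 900, 1000, 1200, 1500, 2500]
transformed = [inverse_rt(rt) for rt in raw]
```

For these illustrative values the raw RTs are strongly right-skewed, whereas the transformed values are close to symmetric, which is the property that motivates the transformation before fitting a linear model.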
A linear mixed effects model was fit to the data with prime-target relationship (orthographically related vs. unrelated), test session (pre- vs. post-exposure) and exposure condition (reading-only vs. reading-plus) as fixed factors. The results of the statistical model are shown in Table 3. TOWRE was a significant covariate (t = −3.29), indicating that children with better word reading ability were faster overall. Neither the main effect of exposure condition (M (reading-only) = 1093, M (reading-plus) = 1169, t = 0.19) nor that of test session (M (pre-exposure) = 1123, M (post-exposure) = 1134, t = −0.07) was significant. There was a significant main effect of prime-target relationship (M (unrelated) = 1081, M (related) = 1177, t = 2.16), indicating slower responses to targets preceded by related primes than unrelated primes. Importantly, this main effect was qualified by a significant prime-target relationship × test session interaction (shown in Fig. 3, t = 3.92), signalling a switch in the direction of priming effects following exposure. Before exposure, there was no difference between targets preceded by unrelated primes compared to related primes (t = −0.97), although numerically, RTs were shorter following related primes than unrelated primes, indicating facilitation. In contrast, after exposure children were significantly slower to respond to targets primed by related unfamiliar words than unrelated words (t = 4.03), denoting inhibition. No further interactions were significant and there was no difference in the magnitude or pattern of priming between reading-only and reading-plus conditions.

Footnote 1. Using log transformation instead did not change the pattern of results in either experiment.
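The direction of a priming effect follows from the sign of the difference between related and unrelated mean RTs. The sketch below applies this to the overall means reported for the main effect of prime-target relationship; the function names are hypothetical, and the label describes direction only, not statistical significance.

```python
def priming_effect(related_rt_ms: float, unrelated_rt_ms: float) -> float:
    """Related minus unrelated mean RT; positive values mean slower responses after related primes."""
    return related_rt_ms - unrelated_rt_ms

def describe(effect_ms: float) -> str:
    """Label the direction of the effect."""
    if effect_ms > 0:
        return "inhibition"
    if effect_ms < 0:
        return "facilitation"
    return "none"

# Overall means reported in Experiment 1: related = 1177 ms, unrelated = 1081 ms.
exp1_overall = priming_effect(1177, 1081)
```

A nonword-like prime would be expected to give a negative difference (facilitation), so a shift towards positive values after exposure is the signature of the prime-lexicality effect.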

Discussion
Children showed good levels of orthographic learning across all three measures of lexical configuration. Performance was high in both the reading-only and the reading-plus conditions, with a small additional benefit found for spelling in the reading-plus condition. A limitation to note is that definition knowledge was scored by the experimenter who was not blind to condition, an issue addressed in Experiment 2. After exposure, newly-learned words acquired via story reading also competed with their neighbors. This is the first demonstration of lexical engagement in children's reading.
Surprisingly, engagement was similar across conditions: directed exposure in the reading-plus condition did not benefit lexicalization over and above reading stories alone. It is possible that with a larger sample of children (or items) and increased statistical power, reliable differences between the exposure conditions might emerge. We return to the issue of sample size in the General Discussion. In the meantime, it is worth noting that along with the null effect of exposure condition, numerically more inhibition was seen post-exposure for words acquired in the reading-only condition, not the reading-plus condition. This suggests that directed or focused instruction is not needed to induce lexical engagement of written words in children of this age. Instead, exposure to novel words in a more natural reading situation, and without explicit activities directing attention to the new words, seems sufficient for word learning, in terms of both lexical configuration and lexical engagement. That being said, our materials in the reading-only condition were artificial in the sense that explicit definitions and semantic referents were provided. Of the six exposures, one provided a dictionary-style definition (e.g., ''to wield a weapon means to hold it out ready to attack") and three further sentences conveyed additional semantic information (e.g., ''a knight might draw his sword and wield it at a dragon"). These cues to the meaning of the new word mean that learning was quite directed and explicit, even in the reading-only condition. Thus, the nature of lexical configuration and lexical engagement following less constrained exposure to novel words in reading experience remains an open question.
We build on these findings in Experiment 2 by investigating incidental learning, where explicit reference to word learning is minimized. This was achieved by removing explicit definitions for the unfamiliar words and increasing the length of the story contexts (cf. Swanborn & de Glopper, 1999). We also focused on the time course of learning. Children encountered unfamiliar words in a single session with four exposures or in three sessions each containing four exposures, resulting in 12 exposures in total. Given previous findings, we anticipated fast learning in terms of lexical configuration, even in the minimal exposure condition. We predicted a slower time course for lexical engagement, with more exposure needed before the newly learned words induced competition.

Design
Like Experiment 1, Experiment 2 involved pre-exposure, exposure, and post-exposure phases. The exposure phase included a between-subjects manipulation of number of exposures (four exposures vs. twelve exposures). Both conditions involved incidental exposure via story reading, with explicit reference to new words and their meanings minimized. We used the same 20 unfamiliar-base word pairs as before, and the same lexical decision task was administered before and after exposure to track the prime-lexicality effect. The task was presented on a Dell Latitude E6530 laptop and a Serial Response Box (Psychology Software Tools, Inc.) was used to record responses. Explicit knowledge of form and meaning was assessed as per Experiment 1, using orthographic decision, spelling, and definition knowledge.

Participants
Twenty different children (10 female) aged 9-11 years (M = 10.42, SD = 0.82) participated, following the same recruitment and selection procedures as before. Ten children were allocated to each group. The children performed towards the upper end of the average range on the TOWRE (M = 110.60, SD = 9.11) and the two groups did not differ in reading ability or age.

Materials and procedure
We created 20 new stories, each containing four repetitions of one unfamiliar word. The stories were matched in length across conditions (M = 162 words). Unlike Experiment 1, no explicit definitions were provided, although cues to word meaning could be inferred from the context. There was no cover story alluding to new words. The stories were presented as short child-friendly chapters as before, but without any pictures. An example story is provided in Appendix A.
Twelve-exposures condition.
This involved three sessions spread over different days with four exposures per unfamiliar word per session (12 exposures in total). The number of days across which children completed the exposure phase varied between three and 16 days (M = 6.7, SD = 4.57). In session one, children were told they would be reading a story printed on several cards, similar to chapters in a book. They were asked to read each story aloud, in random order. After half of the stories, there was a short break and the children played a nonverbal game. Sessions two and three were identical. Any errors were corrected by the experimenter.

Four-exposures condition.
This involved a single session with four exposures per unfamiliar word (4 exposures in total), identical to the first session in the twelve-exposures condition.
Post-exposure, the same assessments were carried out as in Experiment 1, with two minor amendments. For definition knowledge, each word was presented one at a time printed on a card rather than on a computer screen. Responses were double-scored, by the first author and by an independent blind rater. Inter-rater reliability was very high (Cohen's kappa: 0.93) and therefore we used the blind ratings in the analysis. Orthographic decision was assessed by asking children to sort the words into two stacks, those with correct spellings and those with incorrect spellings. This procedure was chosen to prevent children from rushing through the task without paying close attention. Children were tested between one and five days after completing the exposure phase (M = 1.85, SD = 1.39). This did not differ significantly between groups. As the to-be-learned words were the same as those used before, the baseline spelling data from the untrained children in Experiment 1 continued to serve as the baseline here.
Fig. 4 shows that children in the twelve-exposure condition learned the unfamiliar words well, both in terms of form (orthographic decision and spelling) and meaning (definitions). As in Experiment 1, spelling accuracy was considerably higher for children in the experiment, relative to those who contributed baseline data, approaching ceiling in both conditions. While form was learned well in the minimal exposure group, meaning was not. To compare performance between exposure conditions, mixed effects logistic regression models were fit separately for each task, with exposure (four vs. twelve) entered as a fixed factor. The results of the statistical models are shown in Table 4. For orthographic decision, data were missing from one child in the four-exposures condition. There was no difference in performance as a function of exposure (p = .261).
Similarly, there was no difference in spelling (p = .575), although TOWRE was a significant covariate (p = .042), indicating better spelling performance for children with higher TOWRE scores. Although there was no difference between conditions for learning of form, children in the twelve-exposure group performed much better on the definitions task (p < .001). TOWRE score was not a significant covariate (p = .329).

Table 3 Results of the models examining inverse transformed RT in the masked priming lexical decision task in Experiment 1.
Note. The maximal model did not converge. To simplify, the correlation between the random intercept for item and the by-item random slopes was removed. The difference in explained variance between the maximal model and the simplified final model was not significant.
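The inter-rater reliability check reported for the definitions task can be illustrated with a minimal sketch of Cohen's kappa for two raters. The function name `cohens_kappa` and the binary scores below are hypothetical and are not the study's data or scoring scheme; the sketch only shows how observed agreement is corrected for chance agreement.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores (1 = definition credited, 0 = not credited).
first_author = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
blind_rater = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(first_author, blind_rater)
```

With these illustrative scores the raters agree on nine of ten items, but kappa is lower than 0.9 because part of that agreement is expected by chance.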

Word learning as indexed by lexical engagement
Lexical decision performance was high, with errors on only 5.34% of trials. Only correct trials were analyzed; a further 36 trials (4.5%) were excluded because of an error in the experimental script. Based on visual inspection of the Q-Q plot, inverse RT values were used in the analysis and one data point was removed. Fig. 5 summarizes performance; for ease of interpretation, raw RT data are plotted.
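The preprocessing described above (keep correct trials, then inverse-transform RT) might be sketched as follows. The exact transformation is not stated in this section; the sketch assumes the common −1000/RT convention, which keeps faster responses at the lower end of the scale. The trial records and the name `inverse_rt` are hypothetical.

```python
def inverse_rt(rt_ms):
    """Inverse-transform a response time in ms (assumed form: -1000 / RT)."""
    if rt_ms <= 0:
        raise ValueError("RT must be positive")
    return -1000.0 / rt_ms


# Hypothetical trials; these are not data from the study.
trials = [
    {"rt_ms": 800.0, "correct": True},
    {"rt_ms": 1250.0, "correct": True},
    {"rt_ms": 2000.0, "correct": False},  # error trial, dropped below
    {"rt_ms": 500.0, "correct": True},
]

# Keep correct trials only, then transform.
transformed = [inverse_rt(t["rt_ms"]) for t in trials if t["correct"]]
```

The transform compresses the long right tail typical of RT distributions, which is the usual motivation for choosing it after inspecting a Q-Q plot.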
Linear mixed effects models were computed with prime-target relationship (related vs. unrelated), session (pre- vs. post-exposure) and exposure condition (four vs. twelve) as fixed factors. The results of the statistical models are shown in Table 5. TOWRE score was a significant covariate (t = −6.23), indicating that children with higher TOWRE scores were faster than less-skilled readers. None of the main effects were significant (prime-target relationship: M (unrelated) = 908, M (related) = 925, t = −0.68; test session: M (pre-exposure) = 924, M (post-exposure) = 910, t = −0.69; exposure condition: M (four-exposures) = 953, M (twelve-exposures) = 880, t = 0.71). There was however a significant test session × exposure condition interaction (t = −2.11), further qualified by a significant three-way interaction between prime-target relationship, test session and exposure condition (t = −2.58). There were no further significant interactions.
The three-way interaction was explored by computing separate models for each exposure condition. As in the main analysis, TOWRE was a significant covariate in both exposure conditions (four, t = −3.66; twelve, t = −6.36). In the four-exposures condition, there was a significant main effect of test session (t = −2.63), indicating faster responses following exposure. The main effect of prime-target relationship was not significant (t = −0.72) and there was no significant interaction. A different pattern of priming emerged in the twelve-exposures condition: neither the main effects of test session (t = 0.85) nor prime-target relationship (t = −0.17) were significant, but the prime-target relationship × test session interaction was evident (t = 2.52). Pairwise comparisons revealed a marginal effect of prime-target relationship before exposure (t = −1.80), indicating a trend towards faster RTs to targets primed by related than unrelated words. After exposure, the effect of prime-target relationship was not significant (t = 1.20). This mirrored the findings of Experiment 1 in showing a switch in the direction of priming following twelve exposures. No such effect was observed following only four exposures.
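As a purely descriptive illustration of what a switch in the direction of priming means, the sketch below computes a priming effect (mean unrelated RT minus mean related RT) before and after exposure. It is not the mixed-model analysis reported above, and the RT values and the name `priming_effect` are hypothetical.

```python
from statistics import mean


def priming_effect(unrelated_rts, related_rts):
    """Mean unrelated RT minus mean related RT, in ms.

    Positive values indicate facilitation by related primes;
    negative values indicate inhibition.
    """
    return mean(unrelated_rts) - mean(related_rts)


# Hypothetical RTs (ms) for one exposure condition; NOT data from the study.
pre = {"unrelated": [930, 950, 910], "related": [900, 915, 885]}
post = {"unrelated": [890, 905, 875], "related": [915, 930, 900]}

pre_effect = priming_effect(pre["unrelated"], pre["related"])
post_effect = priming_effect(post["unrelated"], post["related"])

# A negative change means priming moved from facilitation towards inhibition.
change = post_effect - pre_effect
```

In this toy example facilitation before exposure gives way to inhibition afterwards, the qualitative pattern that a prime-target relationship × test session interaction would capture.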

Discussion
Following incidental exposure, children in the four-exposures condition showed equally good lexical configuration of form as children in the twelve-exposures condition. In contrast, they abstracted less about the meaning of the new words. Despite rapid configuration of orthographic form for all children, lexical engagement only emerged for those in the twelve-exposure condition. This replicates the prime-lexicality effect observed in Experiment 1 in a different sample of children, using materials closer to everyday reading experience and without explicit instruction as to the meaning of the words.

General discussion
Our results show that incidental exposure to unfamiliar words in text provides a powerful learning opportunity. In terms of lexical configuration, children were able to spell new words accurately and distinguish them from foils, an effect that replicated across both experiments. Experiment 1 found little additional benefit of directed exposure, and in Experiment 2, good learning of form was evident after just four incidental exposures; abstraction of meaning, in contrast, was better after twelve exposures. Importantly, the journey of a written word from unfamiliar to known cannot be characterized as a single step: more exposures were needed before unfamiliar words competed with existing words. The switch in priming pattern (i.e., reduction in facilitation and the emergence of inhibition) following exposure replicated across experiments and is in line with the theoretical distinction between lexical configuration and lexical engagement as two different types and phases of word learning (Leach & Samuel, 2007). More broadly, we have established the prime-lexicality effect as a promising new measure of lexical engagement processes in children's orthographic learning.
With regard to the nature of lexical engagement, we adopt a lexical competition account to explain the change in direction of priming as words become familiar, consistent with similar theoretical positions in spoken word learning (Gaskell & Dumay, 2003; Henderson et al., 2013). An alternative account is that responses to neighbors of trained words were slowed due to episodic interference (e.g., misreading effects), independent of the presence of related primes. We did not test this possibility directly, as base words of the training set were always primed by trained related primes at post-exposure. This was an active choice, given constraints on the number of words children can experience in an experiment. While further work is needed to specify the nature of learning tapped by the prime-lexicality effect, our finding that four exposures were sufficient for children to learn the new spellings well, yet insufficient to produce interference in lexical decision, argues against a purely episodic account.
Learning was not influenced by the explicitness of instruction. Additional activities focusing on the new words in Experiment 1 did not increase lexical configuration (bar a slight improvement in spelling) or engagement. We adapted the materials in Experiment 2 to tap incidental learning, removing reference to word learning and avoiding explicit definitions in the texts. Learning remained strong, with the twelve-exposures condition showing identical results to Experiment 1. Clearly, children of this age are well placed to take advantage of the learning opportunities provided by independent reading, even when word learning is not supported by explicit 'dictionary-style' definitions, or activities that direct attention to the spelling or meaning of the new words. This helps us to understand why estimates of how much print a child has experienced are an important predictor of word reading skill (Mol & Bus, 2011) and why reading and vocabulary are strongly associated (e.g., Perfetti, 2007; Ricketts, Nation, & Bishop, 2007).
At this point it is important to note that our sample size is small. Having established that multi-day word learning paradigms can bring about lexical engagement, larger sample sizes are needed to reliably assess what factors may moderate this learning. In Experiment 1 we found no evidence that lexical engagement was enhanced following directed instruction in the reading-plus condition. Given the small sample, it is reassuring to see lexical engagement replicate in Experiment 2. As the training materials in Experiment 2 were more incidental than the reading-only condition in Experiment 1 (e.g., longer texts, no direct definitions provided), this confirms the finding from Experiment 1 that directed instruction is not needed for lexical engagement to emerge: following only twelve incidental exposures, new words competed with existing words. It remains possible however that with increased statistical power, subtle differences between exposure conditions may emerge. Thus, our findings should be extended by future research with larger samples, and with children of varying ages and reading level.
Children were fast to show lexical configuration in Experiment 2, where performance was good for orthographic decision and spelling following both four and twelve exposures. In contrast, more exposure was needed to induce lexical engagement. Although children in the four-exposures condition had the opportunity for sleep-related consolidation, as the post-test was always at least one day after exposure (Bowers et al., 2005; Gaskell & Dumay, 2003), no competition emerged. This might reflect a frequency effect, with simple repetition being key; it might also reflect that children in the twelve-exposure condition experienced their repetitions across multiple sessions, extended over time. Future research should directly compare the number versus spacing of exposures. Another possibility is that a subtle difference in the learning of form between the two exposure conditions was masked by a ceiling effect on spelling and orthographic decision, and that these differences might account for the differences seen in lexical engagement. Importantly though, the strong performance on spelling and orthographic decision following only four exposures in Experiment 2 did not bring with it evidence that those newly learned forms interacted with existing forms.

Table 5 Results of the models examining inverse transformed RT in the masked priming lexical decision task in Experiment 2.
Note. To simplify, the correlation between the random intercept for item and the by-item random slopes was removed, as was the three-way interaction between by-item random slopes. The difference in explained variance between the maximal model and the simplified final model was not significant.

There were, however, more striking differences in semantic learning as a function of number of exposures. Performance on the definitions task in Experiment 2 patterned with the emergence of lexical engagement, with children in the four-exposures group showing less semantic learning than children in the twelve-exposures group. These two observations might be related, if meaning is needed to induce lexical learning of the nature needed to drive the emergence of lexical engagement. While semantic information is thought to contribute to a word's lexical quality (Perfetti, 2007), the role of meaning in orthographic learning is not fully understood in traditional paradigms tapping lexical configuration (Nation & Castles, in press); it has not been explored in tasks that measure lexical engagement in children's written word learning. Research on children's learning of new spoken words suggests that while meaning is not critical for the emergence of competition, longer term retention may be better for words trained with semantics (Henderson et al., 2013). This contrasts with Qiao and Forster's (2013) observation that extended training with explicit focus on meaning is needed for a prime-lexicality effect to emerge in adults; Leach and Samuel (2007) also observed that training new spoken words with meaning brought about stronger lexical engagement effects in adults.
These contrasting findings might reflect a difference between written and spoken language; they might also reflect differences between child and adult learners, with children more ready to accept unfamiliar words as potential words. Like us, Henderson et al. used low-frequency words unfamiliar to children as novel stimuli. In contrast, Qiao and Forster used items that were clearly nonsense words to the adults. Plausibly, this real-world validity might lead to different learning effects. Our data show that children are adept at abstracting meaning from incidental encounters during reading. Whether acquisition of sufficient meaning is critical for the lexicalization of new written forms is an open question.
To conclude, children showed evidence of fast orthographic learning, quickly learning the form of new words from incidental exposure. However, fast learning is only partial learning. On the view that lexical engagement is the hallmark of lexicalization, more exposures were needed for this to emerge, and for meaning to be constructed. Our findings are consistent with models that distinguish between configuration and engagement as phases of word learning.