The relationships between oral language and reading instruction: Evidence from a computational model of reading

Reading acquisition involves learning to associate visual symbols with spoken language. Multiple lines of evidence indicate that instruction on the relationship between spellings and sounds may be particularly important. However, it is unclear whether the effectiveness of this form of instruction depends on pre-existing oral language knowledge. To investigate this issue, we developed a series of computational models of reading incorporating orthographic, phonological and semantic processing to simulate both artificial and natural orthographic learning conditions in adults and children. We exposed the models to instruction focused on spelling-sound or spelling-meaning relationships, and tested the influence of the models’ oral language proficiency on the effectiveness of these training regimes. Overall, the simulations indicated that oral language proficiency is a vital foundation for reading acquisition, and may modulate the effectiveness of reading instruction. These results provide a computational basis for the Simple View of Reading, and emphasise the importance of both oral language knowledge and spelling-sound instruction in the initial stages of learning to read.


Introduction
Reading acquisition requires learning to map written forms (orthography) onto representations of sound (phonology) and meaning (semantics). Even for alphabetic orthographies, in which there is a regular or quasi-regular relationship between graphemes and phonemes (Frost, 2012; Plaut, McClelland, Seidenberg, & Patterson, 1996), learning to read is effortful and frequently fraught with difficulties (Seidenberg, 2017). Effective reading instruction is therefore critical to support children to become proficient readers. There has been a vigorous debate over whether initial reading instruction should focus on the relations between print and sound, or on the relationship between print and meaning (Suggate, 2016; Torgerson, Brooks, Gascoine, & Higgins, 2019). The former is typically characterised by phonics-style training, in which children are exposed intensively to the relationship between the sounds of the language (phonemes) and the letters or letter clusters that represent them (graphemes). The latter is often referred to as meaning-focused or whole-word language instruction, where emphasis is placed on learning the meanings of printed words (Levy & Lysynchuk, 1997).
Proponents of the phonics method (e.g., Bus & van Ijzendoorn, 1999; Ehri, Nunes, Stahl, & Willows, 2001; Ehri et al., 2001) argue that reading instruction should focus on learning spelling-to-sound mappings because exploiting the systematicity of alphabetic writing systems ought to be substantially easier than acquiring more arbitrary spelling-to-meaning mappings. In alphabetic writing systems, spelling-to-meaning mappings can usually only be accomplished word by word (at least for monomorphemic words), without the benefit of generalising from one learned word to the next. Substantial evidence indicates that children's phonological decoding skills are key predictors of reading acquisition (Adlof, Catts, & Little, 2006; Foorman, Herrera, Petscher, Mitchell, & Truckenmiller, 2015; Storch & Whitehurst, 2002; also see Castles, Rastle, & Nation, 2018, for a review).
Alternatively, advocates of meaning-focused methods (Clay, 2001; Davis, 2013; Fountas & Pinnell, 1996; Goodman & Goodman, 2009) argue that the primary goal of reading is to access the meanings of words and so this ought to be the priority of instructional approaches. Although spelling-to-meaning mappings are hard to learn, they may still be acquired early in reading development (Levy & Lysynchuk, 1997; Nation, 2009; Taylor, Duff, Woollams, Monaghan, & Ricketts, 2015) and may be amenable to instruction (Suggate, 2016). For example, Nation and Cocksey (2009) demonstrated that 7-year-old children could access semantic categories of words from orthography very quickly without evidence that the phonological form of the words mediated children's responses.
Recent work has contrasted the effectiveness of sound-focused and meaning-focused training in a laboratory model of reading acquisition (Taylor, Davis, & Rastle, 2017). These authors trained literate adult participants to read two sets of 24 novel words which were written in two different unfamiliar alphabetic orthographies (in each orthography, one character related to one phoneme) and compared reading acquisition when training was biased toward orthography-to-semantics (OS) mappings versus orthography-to-phonology (OP) mappings. Examples from one of the artificial orthographies are provided in Fig. 1. Each novel word was assigned a familiar concrete noun meaning (e.g., /gɛd/ referred to camel, and /kɛs/ referred to parsnip), and the mappings between novel words and their referents were counterbalanced across participants.
Prior to reading training, participants were exposed to the mappings between phonology and semantics for the novel words. Then, participants learned OP and OS mappings for both orthographies. For one orthography, participants received OP focused training, which involved three times as many OP training trials as OS training trials, whereas for the other orthography they received OS focused training, which involved three times as many OS as OP training trials. The results demonstrated that OP focused training led to better accuracy and speed in reading aloud, and it also had a transferable benefit to written word comprehension. By contrast, OS focused training resulted in faster but not more accurate written word comprehension, and showed no transferable benefit for the reading aloud task.

Theoretical frameworks for reading instruction
Determining the impact of reading instruction requires a theoretical framework for how reading proceeds. According to the Simple View of Reading (SVR) (Gough & Tunmer, 1986), reading comprehension is a consequence of phonological decoding and oral language skills. During reading training, learners acquire mappings from print to sound, and access meaning based on pre-existing oral language knowledge. There is evidence that both print-to-sound mapping skills (as indexed by pseudoword reading tasks) as well as sound-to-meaning mapping skills (as reflected in oral vocabulary tasks) are predictors of silent reading comprehension performance (e.g., Curtis, 1980; Nation & Snowling, 2004; Hjetland et al., 2019; Ouellette & Beers, 2010; Ricketts, Nation, & Bishop, 2007). For instance, a recent study by Hjetland et al. (2019) has demonstrated that virtually all of the variation in reading ability at the age of seven is due to oral language plus decoding skills, thus supporting the distinction proposed in the SVR (see also Lervåg, Hulme, & Melby-Lervåg, 2018). However, the SVR is underspecified; it is not an implemented processing model (Castles et al., 2018; Chang & Monaghan, 2019; Nation, 2019), and the theory does not even commit to whether decoding reflects sublexical (letter-to-sound) or lexical (whole-word) knowledge.
The triangle model of reading is more fully specified, characterising the representations involved in reading, the pathways between representations, and their varying roles in word comprehension and word naming (Harm & Seidenberg, 2004; Plaut et al., 1996; Seidenberg & McClelland, 1989). In the triangle model, learning to acquire the meaning of written forms of words can be achieved either indirectly, from print to sound and then to meaning, or directly from print to meaning, or via a combination of these routes. Similarly, learning to pronounce a written word can be accomplished via a combination of print-to-sound and print-to-meaning-to-sound mappings.
In an implementation of the triangle model, Harm and Seidenberg (2004) demonstrated the cooperative and competitive nature of print-to-meaning and print-to-sound-to-meaning pathways for written word comprehension, with the sound-mediated pathway contributing earlier in learning to read, and the print-to-meaning pathway playing an increasingly important role later in reading acquisition. This pattern over the time-course of learning reflects the greater difficulty of acquiring arbitrary mappings between written forms and meanings, relative to the more systematic, componential mappings that exist between written and spoken forms in alphabetic orthographies (Plaut et al., 1996). These modelling results suggest that focusing on written-to-spoken mappings early in training ought to be more effective for children's early acquisition of reading.
However, a recent implementation of the triangle model of reading developed by Chang and Monaghan (2019) demonstrated that the involvement of the print-to-sound-to-meaning pathway for written word comprehension was heavily reliant on the proficiency of sound-to-meaning mappings in the model, consistent with the SVR. This is because the decoded phonology of a written word cannot be mapped onto the word's meaning if the model has no prior knowledge of the mapping between sound and meaning for this word. Poor oral language skills therefore mean that decoding the phonology of a word ends in a cul-de-sac with respect to meaning. Simulations conducted by Chang and Monaghan (2019) went on to show that reading aloud is jointly influenced by the print-to-sound pathway and the print-to-meaning-to-sound pathway, and as such is also influenced by oral language skills (specifically, the quality of meaning-sound mappings). These insights have profound implications for the extent to which different forms of reading training may be successful in supporting children's early literacy. A key, as yet untested, prediction of the triangle model of reading is that the success of different reading training methods may be modulated by oral language proficiency. Taylor et al. (2017) demonstrated that a focus on written-to-spoken mappings during training improved both reading aloud and reading comprehension. They argued that these findings suggest that phonics-based training should be most effective for supporting these components of reading in children learning to read for the first time. However, there are three issues raised by this study that require further exploration in terms of determining the effectiveness of reading training regimes.
First, a key aspect of Taylor et al.'s (2017) study design was that adult participants were pre-trained on mappings between phonological and semantic forms for the novel words that were to be learned. This mimicked the fact that children have some oral language knowledge prior to reading. However, the division of labour results in the reading architecture models of Harm and Seidenberg (2004) and Chang and Monaghan (2019) predict that oral language skills should have a profound influence on learning to read and the potential effectiveness of different methods of reading instruction. According to the triangle model of reading, a previously tuned spoken-to-meaning system is likely crucial to allow the transference of knowledge from training on written-to-spoken mappings to access meaning from print. Thus, it is possible that phonics instruction will be most successful only if the learner has previously acquired an effective level of oral language knowledge. As yet, the effectiveness of different reading regimes relating to oral language skills has not been tested in an implemented computational model of reading.
Second, unlike children learning to read for the first time, participants in Taylor et al.'s (2017) study were acquiring a second orthography, which to a certain extent piggy-backs on the reading system that the participants already have in English. Previous studies in second language learning have demonstrated that reading words in a second language automatically activates their lexical representations in the first language (Thierry & Wu, 2007; Wu & Thierry, 2010). Thus, an outstanding question is the extent to which prior language skills, including oral language, print-sound, and print-meaning knowledge, influence the observed differences in the artificial orthography study of sound-focused versus meaning-focused reading training. It is possible that acquiring an additional orthography could suffer interference from, or benefit from transfer from, a first orthography.
Third, the orthography used for the artificial words in Taylor et al. (2017) was entirely consistent in terms of written-to-sound mappings between letters and phonemes. However, the English orthography is composed of both consistent (e.g., beg, leg, peg) and inconsistent (e.g., hint, tint, pint; pant, rant, want) mappings between written and spoken forms. Furthermore, it also contains polymorphemic words (e.g., asked, asking, asks; refry, rebook, return) that contain regularities in written-to-meaning mappings. The extent to which the controlled training study of Taylor et al. (2017) applies to a larger, more complex, naturalistic orthographic system, such as children learning to read in English, remains an open question.

The present study
We constructed a series of computational models of reading to investigate the role of different training regimes on learning to read, and to determine how varying levels of pre-existing oral language skills influence the effectiveness of these training regimes. We selected the triangle modelling framework to investigate these issues. This framework is suitable because the use of different reading pathways emerges in response to its exposure to the language environment, rather than being pre-specified or hard-wired as it is in other modelling approaches (e.g., Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Perry, Ziegler, & Zorzi, 2010). Furthermore, the model encapsulates multiple pathways to reading, allowing for the investigation of division of labour in the reading system (Chang, Welbourne, & Lee, 2016; Harm & Seidenberg, 2004; Plaut et al., 1996). This flexibility in learning mappings between representations via multiple pathways means that the triangle framework is particularly suitable for investigating how differences in reading instruction affect the processes involved in learning to read. Thus we constructed three different simulations using the triangle model, each with a different degree of pre-literacy oral language skills present within the model prior to learning to read under different training conditions. In particular, the first two simulations used the controlled conditions of the Taylor et al. (2017) artificial orthographic learning studies, examining how the triangle model performed under a written-to-spoken focused versus a written-to-meaning focused training regime and with different levels of pre-existing oral language skills. These simulations had the advantage of relating detailed behavioural data from Taylor et al.'s (2017) training studies to the model's behaviour.
Specifically, Simulation 1 tested whether the advantage for the written-to-spoken focused training demonstrated in Taylor et al.'s (2017) study was present even for the model with poor oral language skills, or whether this benefit was observed only when well-established mappings between spoken and meaning representations were in place. Tracking the relative benefit of written-to-spoken and written-to-meaning focused training as a function of pre-literate oral language skills enables greater clarity on how different reading training schemes may benefit readers with varying language abilities. Simulation 2 tested whether the results from Simulation 1 are reproduced when the model acquires a second orthography. Whereas Simulation 1 trained the model to learn artificial words without any prior orthographic training, akin to children's learning, Simulation 2 trained the model to learn artificial words with prior knowledge of English, akin to literate adults' learning. This allowed us to investigate the extent to which existing knowledge of an alphabetic orthography impacts on learning to read a new alphabetic script, enabling a test of the validity of adult-learning studies as an inquiry into reading acquisition. This has implications for the extent to which Taylor et al.'s (2017) laboratory-based study is a valid model of children's learning.
Lastly, to investigate the extent to which the training effects could scale up to a larger, more representative vocabulary of English, Simulation 3 tested whether pre-existing oral language knowledge has a similar impact on the relative benefits of different forms of reading training when the model was trained with a large set of English words with more variation in word properties. This simulation thus provided a model that more closely explored the conditions of reading instruction for children's literacy acquisition.

Simulation 1: Learning to read an artificial orthography
A fully implemented triangle model of learning to read was developed (Chang & Monaghan, 2019; Chang, Monaghan, & Welbourne, 2019; Harm & Seidenberg, 2004; Monaghan et al., 2017). The model learned to map between representations of orthography, phonology, and semantics of words. The model was first trained to different degrees of proficiency in mapping between phonological and semantic representations of words, to simulate pre-literate oral language skills. We tested three different quantities of pretraining to reflect a model with low, medium, and high levels of oral language skill. Oral language skill was conceptualised as the fidelity of phonological and semantic representations within the model, measured in terms of the proportion of words in the language for which the model was able to generate the correct semantic and phonological representations. We then compared the effects of two reading training regimes, orthography to phonology (OP) focused training and orthography to semantics (OS) focused training, on each of the models. The OP focused training model received three times as much training on the OP mappings as on the OS mappings, while the OS focused training model received three times as much training on the OS mappings as on the OP mappings. These model training regimes mimicked those used by Taylor et al. (2017). One difference is that the simulation used a between-model design whereas Taylor et al.'s study used a within-subject design, where the participants learned two artificial scripts at the same time. However, the model's performance is less prone to individual variation and so the precise effects of different training regimes can be investigated independently in the modelling work. We evaluated the model's performance under these different training regimes in terms of accuracy of reading aloud and written word comprehension.

Training corpus: artificial words
The training corpus comprised 24 artificial words, taken from the materials in Taylor et al. (2017). For the phonological forms, all items were monosyllabic consonant-vowel-consonant pseudowords and were constructed from 12 consonants (/m/, /t/, /g/, /b/, /k/, /d/, /n/, /s/, /z/, /v/, /p/, and /f/) and four vowel phonemes (/ɛ/, /aɪ/, /əʊ/, and /ʌ/). Within this set of artificial words, each consonant occurred twice in onset position and twice in coda position, and each vowel occurred six times (Taylor et al., 2017). Under slot-based coding, 28% of the artificial word pairs shared one letter and 1% shared two letters. This is similar to the letter distribution of the 1737 regular words that children are exposed to in their first year of school: 22% of these word pairs share one letter, 3% share two letters, 0.4% share three letters and 0.004% share four letters (figures taken from the children's printed word database, CPWD; Masterson, Stuart, Dixon, & Lovejoy, 2019). Note that the use of slot-based coding here illustrates that letter distribution over the artificial words is reasonably close to that in children's early vocabulary. However, slot-based coding does not take into account the similarity of letters occurring in different positions; if these other forms of similarity were considered, the overlap for both the model and children's printed word reading would be higher.
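The slot-based overlap figures above can be illustrated with a minimal sketch. It assumes equal-length, position-aligned strings, and the four items in `toy_words` are hypothetical placeholders rather than the actual stimuli; the function names are likewise illustrative.

```python
from collections import Counter
from itertools import combinations


def shared_slots(a: str, b: str) -> int:
    """Count the positions at which two aligned strings hold the same letter."""
    return sum(x == y for x, y in zip(a, b))


def overlap_distribution(words):
    """Proportion of word pairs that share 0, 1, 2, ... letters in the same slot."""
    pairs = list(combinations(words, 2))
    counts = Counter(shared_slots(a, b) for a, b in pairs)
    return {k: counts[k] / len(pairs) for k in sorted(counts)}


# Hypothetical CVC items used only to exercise the functions.
toy_words = ["tep", "tob", "gep", "kis"]
toy_distribution = overlap_distribution(toy_words)
```

Applied to the full set of 24 items, the same computation would run over all 276 word pairs.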
For phonology, each word was represented in the 3rd, 4th and 5th slots of a set of eight phoneme slots, with each slot consisting of 25 phonological features (including, for instance, voice, nasal, labial, palatal, round, etc.). The number of active phonological features ranged from 7 to 12 (M = 9.25 and SD = 1.42). Each word was positioned with its vowel at the fourth phoneme slot. The first three slots were for onset consonants, and the last four slots were for coda consonants, but because all words in this set had just one onset and one coda consonant, only one of these slots was used during training (so for the word "tep" its phonology was represented as _ _ t ɛ p _ _ _, where _ indicates an empty slot).
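As an illustration of the vowel-centred phoneme template, the following sketch places a syllable into eight slots. The function and constant names are hypothetical, and the expansion of each slot into 25 phonological features is omitted because the feature table is not given here.

```python
N_PHONEME_SLOTS = 8
VOWEL_SLOT = 3  # zero-based index of the fourth slot


def phoneme_slots(onset, vowel, coda):
    """Place onset consonants just before the vowel slot and coda consonants just after it."""
    if len(onset) > VOWEL_SLOT or len(coda) > N_PHONEME_SLOTS - VOWEL_SLOT - 1:
        raise ValueError("too many consonants for the slot template")
    slots = ["_"] * N_PHONEME_SLOTS
    start = VOWEL_SLOT - len(onset)
    slots[start:VOWEL_SLOT] = onset
    slots[VOWEL_SLOT] = vowel
    slots[VOWEL_SLOT + 1:VOWEL_SLOT + 1 + len(coda)] = coda
    return slots


tep = phoneme_slots(["t"], "ɛ", ["p"])
```

Joining `tep` with spaces gives the layout quoted in the text for "tep".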
For orthographic forms in the artificial language, the correspondence between letters and phonemes was transparent (i.e., there was a one-to-one correspondence). For orthography, each word was represented across a layer containing 14 letter slots with each slot comprising 26 units, each of which could represent a distinct letter, so an alphabet of up to 26 letters could be represented. Words were positioned with their vowel aligned on the fifth slot. Consonants preceding the vowel were positioned in the slots immediately before the vowel and consonants following the vowel were positioned starting from the seventh slot. This representation is the same as in Chang and Monaghan (2019), which enabled words up to 14 letters to be represented. However, because all words in this simulation of artificial orthographic learning were three letters in length, with one onset and one coda consonant, words occupied only the 4th, 5th, and 7th slots (so for the word "tep" its orthography was represented as _ _ _ t e _ p _ _ _ _ _ _ _). Note that we use Roman alphabet characters here as a shorthand for the alphabet used in the laboratory-based study. However, nothing in the model's representations depends on the particular alphabet used, only that the model is able to distinguish the letters from one another from the outset.
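The orthographic coding can be sketched in the same way. This is a minimal illustration assuming a single vowel letter and lower-case Roman letters as stand-ins for the artificial characters; `letter_slots` and `one_hot` are hypothetical helper names, not part of the original simulator.

```python
import string

N_LETTER_SLOTS = 14
VOWEL_SLOT = 4  # zero-based index of the fifth slot
CODA_START = 6  # zero-based index of the seventh slot
ALPHABET = string.ascii_lowercase  # 26 units per slot


def letter_slots(onset: str, vowel: str, coda: str):
    """Align the vowel on the fifth slot, onset letters before it, coda letters from the seventh slot."""
    slots = ["_"] * N_LETTER_SLOTS
    for i, ch in enumerate(onset):
        slots[VOWEL_SLOT - len(onset) + i] = ch
    slots[VOWEL_SLOT] = vowel
    for i, ch in enumerate(coda):
        slots[CODA_START + i] = ch
    return slots


def one_hot(slots):
    """Expand each slot into 26 binary units, giving a 14 x 26 = 364-unit input vector."""
    vector = []
    for ch in slots:
        units = [0] * len(ALPHABET)
        if ch != "_":
            units[ALPHABET.index(ch)] = 1
        vector.extend(units)
    return vector


tep_slots = letter_slots("t", "e", "p")
tep_vector = one_hot(tep_slots)
```

For a three-letter item, exactly three of the 364 units are active.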
For semantics, a set of familiar objects consisting of six fruits and vegetables, six vehicles, six animals, and six tools were randomly assigned to the 24 artificial words, as in Taylor et al. (2017). The semantic representation for each word was derived from WordNet (Miller, Beckwith, Fellbaum, Gross, & Miller, 1990), following Harm and Seidenberg (2004). Each semantic representation was composed of 2446 semantic features. The presence of a semantic feature in the meaning representation of a word was encoded as one and the absence of semantic features was encoded as zero in the respective slot in the semantic layer. The number of active semantic features ranged from 3 to 56 (M = 19.13 and SD = 12.9).
As the artificial word vocabulary was small, a potential issue was whether the phonological and semantic representations of this set of artificial words would reflect the relative ease of mappings in English, with OP being more systematic than OS. We investigated the issue by calculating distance scores between each pair of the 24 orthographic representations except self-pairs and applied the same procedure to the phonological and semantic representations. We then correlated the pairwise distance scores of orthography with phonology and orthography with semantics, as a measure of the degree to which similarity in one domain is systematically related to similarity in the other domain. The results showed that the correlation of distance scores between orthographic and phonological representations was 0.67 (p < .001) while the correlation between orthographic and semantic representations was 0.05 (p = .377). These data indicate greater systematicity for the OP than the OS mappings, demonstrating that the artificial word representations adequately captured the distinction between quasi-systematic (OP) and more arbitrary (OS) mappings in English.
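The systematicity measure can be sketched as a correlation between two sets of pairwise distances. The text does not specify the distance metric, so Euclidean distance and a Pearson correlation are assumptions here, and significance testing is left out; all names are illustrative.

```python
from itertools import combinations
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(vectors):
    """Distances for every unordered pair of items, excluding self-pairs."""
    return [euclidean(u, v) for u, v in combinations(vectors, 2)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def systematicity(domain_a, domain_b):
    """Correlate pairwise distances in one domain with those in another."""
    return pearson(pairwise_distances(domain_a), pairwise_distances(domain_b))


# Toy vectors (not the real representations) for three items.
toy_orth = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
toy_phon = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
```

A value near 1 indicates that items similar in one domain are also similar in the other; a value near 0 indicates an arbitrary mapping.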

Model architecture
The architecture of the model is shown in Fig. 2, and is the same as the developmental model of reading implemented in Chang and Monaghan (2019). The model consisted of three key processing layers representing orthographic, phonological and semantic representations, and four hidden layers that learned to map between the processing layers. An attractor layer, which contained 50 hidden units, was connected to and from the phonological layer. Similarly, there was a set of 50 attractor units for the semantic layer. The attractors were used to help the model develop stable phonological and semantic representations of words. The semantic layer was connected to the phonological layer through a set of 300 hidden units, and the phonological layer was connected back to the semantic layer through another set of 300 hidden units. The orthographic layer was connected to both the phonological and semantic layers through different sets of 500 hidden units. Fig. 2 also illustrates a semantic context layer, which was not operational in this Simulation, but was used for the larger vocabularies in Simulations 2 and 3 (see Section 3.2.1 for more details).
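The layer sizes described above can be collected into a small configuration sketch. The dictionary layout and the `connection_weights` helper are illustrative only; the weight count assumes full connectivity through each hidden layer and ignores bias terms, which the text does not specify.

```python
ARCHITECTURE = {
    "layers": {
        "orthography": 14 * 26,  # 14 letter slots x 26 units
        "phonology": 8 * 25,     # 8 phoneme slots x 25 features
        "semantics": 2446,       # semantic features
    },
    # (source layer, target layer, number of hidden units on the pathway)
    "pathways": [
        ("orthography", "phonology", 500),
        ("orthography", "semantics", 500),
        ("semantics", "phonology", 300),
        ("phonology", "semantics", 300),
        ("phonology", "phonology", 50),   # phonological attractor
        ("semantics", "semantics", 50),   # semantic attractor
    ],
}


def connection_weights(source, target, hidden):
    """Weights on one pathway, assuming full connectivity and no bias terms."""
    layers = ARCHITECTURE["layers"]
    return layers[source] * hidden + hidden * layers[target]
```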

Training procedure
The training process had two phases: oral language training and reading training. The model was trained on the 24 artificial words with a learning rate of 0.05 using a back-propagation through time (BPTT) algorithm (Pearlmutter, 1989, 1995; Plaut et al., 1996), in which error gradients were integrated over time based on time-averaged inputs. In the model, continuous time was approximated by discrete time steps with an integration constant of 0.33 (Harm & Seidenberg, 2004). A sigmoid function was used as the activation function. The initial weights were randomly set to values between −0.1 and 0.1. The same training procedures applied to both oral language training and literacy training. Five versions of each model were trained with different random initial weights and different random samplings from the words. The model was built using the MikeNet neural network simulator (Harm & Seidenberg, 2004).
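To make the role of the integration constant concrete, here is a minimal sketch of a single unit with time-averaged inputs. The exact update rule used by the simulator is not given in the text; the form below (a weighted blend of the current net input and the previous integrated input, starting from zero) is an assumption in the spirit of the time-averaged formulation cited above.

```python
from math import exp

TAU = 0.33  # integration constant reported in the text


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + exp(-x))


def run_unit(net_inputs, tau=TAU):
    """Integrate a unit's net input over discrete time steps and return its activations."""
    x = 0.0  # assumed initial integrated input
    activations = []
    for net in net_inputs:
        x = tau * net + (1.0 - tau) * x  # time-averaged input
        activations.append(sigmoid(x))
    return activations


# A constant net input held for eight time steps.
ramp = run_unit([2.0] * 8)
```

Under this rule the unit's activation rises gradually towards the value it would take with an instantaneous input, which is why inputs are presented for several time steps.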
For oral language training, the model learned PS mappings as in an oral vocabulary comprehension task, and SP mappings as in a meaning naming task (e.g., picture naming). To investigate how oral language skills affected literacy development, we used three different quantities of oral language training: 500, 1000, or 2000 learning trials. For the oral vocabulary comprehension task (PS), the phonological representation of the word was presented at the phonological layer for eight time steps, and the model generated a semantic representation at the semantic layer. The difference between the actual and the target semantic representation was then calculated, and the weights on connections between all the layers were adjusted according to gradient descent backpropagation through time in order to reduce the error. Similarly, for the oral language meaning naming task (SP), the semantic representation was presented at the semantic layer for eight time steps, and the model was required to produce a phonological representation. During oral language training, the model additionally learned to develop stable phonological to phonological (PP) and semantic to semantic (SS) representations, by presenting the phonological or the semantic representation for two time steps, then allowing the model to cycle activation for a further six time steps to reproduce the initial representation. This permitted the model to develop attractor states corresponding to word meanings and pronunciations. During oral language training, these four tasks (PS, SP, PP, and SS) were interleaved, with 40% of trials for the oral vocabulary task, 40% of trials for the meaning naming task, 10% of trials for the phonological attractor and 10% for the semantic attractor. For each trial, a word was randomly selected with replacement.
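The interleaving of the four oral language tasks can be sketched as weighted random sampling. This assumes that the task for each trial is drawn independently with the stated proportions as probabilities, which the text does not state explicitly; the word list and function name are hypothetical.

```python
import random

# Task proportions reported in the text.
TASK_WEIGHTS = {"PS": 0.4, "SP": 0.4, "PP": 0.1, "SS": 0.1}


def oral_training_schedule(words, n_trials, seed=0):
    """Return (task, word) pairs: task drawn by weight, word drawn with replacement."""
    rng = random.Random(seed)
    tasks = list(TASK_WEIGHTS)
    weights = list(TASK_WEIGHTS.values())
    return [
        (rng.choices(tasks, weights=weights)[0], rng.choice(words))
        for _ in range(n_trials)
    ]


# Hypothetical items; 2000 trials corresponds to the longest training quantity.
schedule = oral_training_schedule(["tep", "gob", "kis"], 2000)
```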
After oral language training, the weights of connections between the semantic and phonological layers were fixed. The model was then trained to learn to read with different focuses of reading instruction, either with the OP focused or OS focused training. For each reading learning trial, a word was randomly selected and presented at the orthographic layer for 12 time steps. For an OP trial, the model's error at the phonological layer at the final time step was computed and then backpropagation with gradient descent adjusted the weights to reduce this error; the model thus received feedback on its production of the phonology from the orthography. For an OS trial, error was propagated from the semantic representation, and so the model had feedback on its semantic production from orthographic input. For the OP focused training model, there were three OP trials for every OS trial, and for the OS focused model, there were three OS trials for every OP trial. Each model was trained for 1000 reading trials.
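The 3:1 trial ratio in the two reading regimes can be illustrated with a simple schedule generator. Whether trial types were cycled in fixed blocks or sampled at random is not stated, so the fixed repeating block below is an assumption made for illustration, as are the names.

```python
import random


def reading_schedule(words, focus, n_trials=1000, seed=0):
    """Return (trial_type, word) pairs with three focused trials for every non-focused trial."""
    if focus not in ("OP", "OS"):
        raise ValueError("focus must be 'OP' or 'OS'")
    other = "OS" if focus == "OP" else "OP"
    block = [focus, focus, focus, other]  # assumed repeating 3:1 block
    rng = random.Random(seed)
    return [(block[i % len(block)], rng.choice(words)) for i in range(n_trials)]


# Hypothetical items; 1000 trials matches the reading training length in the text.
op_focused = reading_schedule(["tep", "gob", "kis"], "OP")
os_focused = reading_schedule(["tep", "gob", "kis"], "OS")
```

With 1000 trials, each regime contains 750 trials of the focused type and 250 of the other.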

Testing procedure
Following previous simulation work (Chang, Monaghan, & Welbourne, 2019; Harm & Seidenberg, 2004), the nearest neighbour measure was used to assess the phonological and semantic representations that the model developed. For testing the model's phonological output, we determined the number of words for which all phonemes were correctly produced. For each phoneme slot, the phoneme closest to the model's actual output, measured in terms of Euclidean distance over the set of all phonemes in the language, was identified and compared against the target phoneme. If the actual and target phonemes were the same in every slot, then the model was judged to have spoken the word correctly. For testing the model's semantic output, the activation of units at the semantic layer was recorded. Accuracy was measured by computing the Euclidean distance between the model's actual semantic representation and the semantic representation of each word in the training corpus. If the smallest distance was for the target representation then the model was judged to be correct.
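The nearest neighbour scoring can be sketched as follows. The toy feature vectors and function names are hypothetical; the real phoneme and semantic vectors have 25 and 2446 dimensions respectively.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest(output, candidates):
    """Return the label of the candidate vector closest to the output vector."""
    return min(candidates, key=lambda label: dist(output, candidates[label]))


def word_spoken_correctly(output_slots, target_phonemes, phoneme_features):
    """A word counts as correct only if every slot's nearest phoneme matches its target."""
    return all(
        nearest(out, phoneme_features) == target
        for out, target in zip(output_slots, target_phonemes)
    )


def meaning_correct(output, target_word, lexicon):
    """Comprehension is correct if the target word's semantics is the closest in the corpus."""
    return nearest(output, lexicon) == target_word


# Toy three-feature phoneme inventory, with "_" standing for an empty slot.
toy_phonemes = {"t": [1, 0, 0], "p": [0, 1, 0], "_": [0, 0, 0]}
toy_output = [[0.8, 0.1, 0.0], [0.2, 0.9, 0.1]]
```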

Results
For the oral language tasks, the model trained with 500, 1000, and 2000 stimulus presentations achieved 75%, 90%, and 100% accuracy on the meaning naming (SP) task and 46.7%, 76.7% and 97.5% accuracy on the oral vocabulary comprehension (PS) task, respectively. This pattern of results is in line with performance of the model when trained with a substantially larger vocabulary (Chang & Monaghan, 2019; Monaghan, Chang, Welbourne, & Brysbaert, 2017). The three training schedules thus reflect different levels of pre-literate oral language skills, from poorer through to near-ceiling vocabulary knowledge.
[...] Although Taylor et al. (2017) did not directly compare reading aloud and comprehension performance, Fig. 3 suggests that this was not the case in their behavioural data. However, we also assessed model performance using a feature-based approach, which measured whether at least 90% of the target features were correctly activated in the model's actual representation of each word. The results showed that reading aloud and written word comprehension were similar in initial performance, and written word comprehension was even a little harder than reading aloud, consistent with Taylor et al. (2017). Importantly, even with this alternative method of assessing performance the key statistical results remained the same. Thus we opted to report the results based on the nearest distance measure, consistent with previous modelling approaches, while the results based on the feature-based measure are reported as supplementary materials.
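The feature-based criterion can be sketched as below. The text gives only the 90% criterion; treating a target feature as "correctly activated" when its output activation exceeds 0.5 is an assumption, and the function name is illustrative.

```python
def feature_based_correct(output, target, criterion=0.9, on_threshold=0.5):
    """True if at least `criterion` of the target's active features are on in the output."""
    active = [i for i, value in enumerate(target) if value == 1]
    if not active:
        raise ValueError("target has no active features")
    hits = sum(output[i] > on_threshold for i in active)
    return hits / len(active) >= criterion


# Toy target with ten active features followed by five inactive ones.
toy_target = [1] * 10 + [0] * 5
nine_of_ten = [0.9] * 9 + [0.1] + [0.0] * 5
eight_of_ten = [0.9] * 8 + [0.1, 0.1] + [0.0] * 5
```

Unlike the nearest neighbour measure, this criterion does not depend on how close the output is to competing items in the vocabulary.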
Overall, the model performed better on the tasks for which it had undergone intensive training. For reading aloud, the OP focused training model performed better than the OS focused training model. Adding training focus as a fixed factor resulted in a significant improvement in model fit compared to a model with random effects of item and simulation and with fixed effects of oral language training and reading training stage, χ²(1) = 407.65, p < .001. For written word comprehension, the OS focused training model performed better than the OP focused training model, reflected by the addition of training focus improving model fit, χ²(1) = 308.82, p < .001.
However, oral language training had an asymmetric effect on the accuracy of performance on the reading aloud and written word comprehension tasks, according to whether the model had been trained with an OP or OS focus. For reading aloud, different levels of oral language training had no observable effect on performance for either the OP or the OS focused training model. Adding oral language training as a fixed factor did not result in a significant improvement in model fit compared to a model with random effects of item and simulation and with fixed effects of reading training stage and training focus, p = .39. This can be seen in Fig. 3, which shows little difference in the trajectories of accuracy for either the OS or OP focused training models with 500 versus 2000 oral language training epochs.
In contrast, for written word comprehension, oral language training had a substantial effect: adding oral language training as a fixed factor improved model fit compared to a model with random effects of item and simulation and with fixed effects of reading training stage and training focus, χ²(2) = 33.94, p < .001. This effect was most likely generated by the OP focused training model. After substantial oral language training (2000 oral language training epochs, producing close to perfect performance on the oral vocabulary comprehension and meaning naming tasks), the performance of the OP focused training model began to converge with that of the OS focused training model. This observation was confirmed by adding the interaction between oral language training and training focus as a fixed factor, which improved model fit compared to the model containing only random and fixed effects, χ²(2) = 10.27, p < .001. In particular, relative to 500 oral language epochs, the performance difference between the OP focused training model and the OS focused training model for 2000 oral language epochs was significantly smaller, β = −0.78, p < .001. For 1000 oral language epochs, the difference was also smaller, but not significantly, β = −0.37, p = .12. The behavioural effects in Taylor et al. (2017) showed that written word comprehension was similarly good following OP focused as OS focused training, and these effects are closely replicated in the model, but only when the model has advanced oral skills prior to literacy onset. As in the behavioural study, the performance of the OP and OS focused training models eventually converged near the end of training.

Division of labour
The simulation results suggest that the performance of the OP focused training model on written word comprehension could be enhanced by effective pre-literacy oral language training, but the performance of the OS focused training model on reading aloud was not sensitive to these differences in oral language skills. To explore how different reading pathways in the model contribute to reading aloud and written word comprehension, we analysed division of labour between alternative pathways in the triangle model for both the OP focused and OS focused training models using a lesioning technique developed in previous modelling studies (Chang et al., 2016; Welbourne, Woollams, Crisp, & Lambon Ralph, 2011). For reading aloud, to isolate the contribution from the OP pathway, the links between the hidden units and the phonological units in the OSP pathway were entirely lesioned and then the model's performance in producing phonological representations was assessed. The reverse procedure was used to obtain the contribution from the OSP pathway, where the links between the hidden units and the phonological units in the OP pathway were lesioned. Similarly, for written word comprehension, to isolate the contribution from the OS pathway, the model's semantic performance was computed by lesioning the links between the hidden units and the semantic units in the OPS pathway. Again, the reverse procedure was used to obtain the contribution from the OPS pathway. High error scores and low accuracies in the model indicate poor performance when that pathway is lesioned. To include both measures, a composite score was computed by dividing error scores by accuracies. This is equivalent to inverse efficiency (Seidenberg & McClelland, 1989), in which error scores are taken as a proxy for RTs. The reciprocals of the composite scores obtained from the alternative pathways to reading aloud or written word comprehension were then used to determine the division of labour across the pathways.
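The scoring steps described above can be sketched as follows. The composite score (error divided by accuracy) and its reciprocal follow the description in the text; expressing the division of labour as each pathway's share of the summed reciprocals is an assumption made here for illustration, as are the function names and the example values.

```python
from typing import Dict, Tuple


def composite_score(error: float, accuracy: float) -> float:
    """Composite score: error divided by accuracy (lower means better)."""
    if error <= 0:
        raise ValueError("error must be positive")
    if accuracy <= 0:
        return float("inf")  # the isolated pathway produced nothing correct
    return error / accuracy


def division_of_labour(isolated: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Convert per-pathway (error, accuracy) pairs into relative shares.

    `isolated[name]` holds the scores measured when only pathway `name`
    is intact, i.e. when the alternative pathway has been lesioned.
    Assumption: shares are reciprocal composite scores normalised to sum to 1.
    """
    strengths = {}
    for name, (error, accuracy) in isolated.items():
        score = composite_score(error, accuracy)
        strengths[name] = 0.0 if score == float("inf") else 1.0 / score
    total = sum(strengths.values())
    if total == 0:
        return {name: 0.0 for name in strengths}
    return {name: s / total for name, s in strengths.items()}


# Hypothetical reading aloud scores: (error, accuracy) for each isolated pathway.
shares = division_of_labour({"OP": (0.5, 0.9), "OSP": (2.0, 0.3)})
print(shares)
```

With these made-up values the OP pathway carries most of the load, because it yields both lower error and higher accuracy when operating alone.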
Fig. 4 shows the patterns of division of labour for both reading aloud and written word comprehension in the OP and OS focused training models with different amounts of oral language training. For reading aloud, the OP pathway was heavily used by the OP focused training model, and the pattern remained similar as oral language training increased. Similar to the OP focused training model, the OS focused training model also used the OP pathway more than the OSP pathway for reading aloud. This may be because the greater systematicity in the OP mappings means that they are easier to learn than the OS mappings in the present artificial word learning (as indicated in the feature-based analyses presented in the supplementary information). For reading aloud, it may also reflect the fact that the OP pathway is shorter than the OSP pathway. However, the division of labour in the model was also affected by reading training, because pathway use for the OS focused training was more equally distributed for all oral language training conditions compared with the OP focused training.
For written word comprehension, use of the OPS pathway increased with greater oral language training in the OP focused training model. This highlights why oral language skills influenced the success of OP focused training for written word comprehension: greater oral language proficiency yields a greater contribution from the OPS pathway. A similar but less pronounced pattern was observed for the OS focused training model. Even though the OS focused training model received much more training on the OS mappings, the model showed greater reliance on the OPS pathway than the OS pathway when oral language proficiency was high. This pattern of results suggests that when oral language proficiency is high, OP mappings plus PS mappings are more efficient than OS mappings. However, when oral language proficiency is low, OS focused training means that the model utilises a combination of both pathways to a greater degree, to compensate for its lower ability to generate meaning via phonology. Together, these results demonstrate that the model's learning and use of different pathways for reading aloud and written word comprehension are affected by a combination of the nature of the mappings between representations in terms of their ease of acquisition, pre-existing oral language skills, and reading training regimes.

Simulation 2: Learning to read a second orthography
In Simulation 1, we demonstrated that the effectiveness of OP focused training for developing written word comprehension depended on preliterate oral language skills. However, the model in Simulation 1 was trained on the artificial orthography without any prior experience of reading other orthographies. In contrast, in Taylor et al. (2017), adult participants already had prior experience of English and learned to associate novel phonological and orthographic forms with familiar meanings. This could mean that phonological representations for both English and artificial words are concurrently activated (Thierry & Wu, 2007; Wu & Thierry, 2010), resulting in a bias towards a particular type of training regime that may not necessarily apply when reading is acquired for the first time. Furthermore, learning an additional artificial language requires interpolation of novel word representations into a language system that is already adept at processing natural language. This could result in support as well as interference for acquisition of the additional representations (see Monaghan et al., 2017, for an example of interactivity in learning sequential languages).
In Simulation 2, we first trained the model to learn to read in English to simulate adult participants with fully developed English reading skills as in Taylor et al. (2017). We then trained the model to learn the artificial orthography as in Simulation 1. We predicted that, if the Taylor et al. (2017) paradigm is valid as a reflection of children's acquisition of literacy skills, then the results from Simulation 2 should resemble those of Simulation 1. However, if prior acquisition of reading in English has an influence on performance, and the Taylor et al. (2017) paradigm is a study of acquisition of an additional orthographic system rather than one that mimics children's literacy acquisition, then the behaviour of Simulation 2 should diverge from that of Simulation 1.

Network architecture
The architecture was the same as in Simulation 1.

Representations
The orthographic, phonological and semantic representations of English words were the same as those used in previous simulations using the triangle model (Chang & Monaghan, 2019; Harm & Seidenberg, 2004). The training corpus contained 6229 monosyllabic words, which covered most monosyllabic words in English. Frequency of each word was derived from the Wall Street Journal corpus (Marcus, Marcinkiewicz, & Santorini, 1993), and the frequency value was log-transformed.
For orthography, each word was represented by 14 letter slots and each slot comprised 26 units, one for each of the 26 letters of the alphabet. The vowel of a word was positioned at the fifth slot, and the second vowel was placed at the sixth slot if applicable. Consonants preceding or following the vowel(s) were positioned at the adjacent slots next to the vowel(s). The phonological and semantic representations were the same as in Chang and Monaghan (2019). The semantic context units were used to provide additional information for training the model on comprehending homophones (these were not needed for Simulation 1, because no homophones were included in that simulation). For homophones belonging to the same homophone group, different context units were active, while for non-homophones, none of the context units was active. Each context unit was balanced in its overall activity across the training set (see Chang & Monaghan, 2019, for more details of the context unit implementation).
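The vowel-centred slot scheme can be illustrated with a minimal sketch. It assumes that the vowel letters are a, e, i, o and u, that a second vowel is only placed in the sixth slot when it immediately follows the first, and that each slot is coded as a one-hot vector over the 26 letters; these choices and the function names are assumptions for illustration, not a description of the original implementation.

```python
from typing import List

N_SLOTS = 14
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
VOWELS = set("aeiou")   # assumption: letters treated as vowels
FIRST_VOWEL_SLOT = 4    # the fifth slot, using 0-based indexing


def slot_layout(word: str) -> List[str]:
    """Place the letters of a monosyllabic word into 14 vowel-centred slots."""
    word = word.lower()
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if not vowel_positions:
        raise ValueError("word contains no vowel letter")
    first = vowel_positions[0]
    # A second vowel is used only if it directly follows the first one.
    n_vowels = 2 if first + 1 < len(word) and word[first + 1] in VOWELS else 1
    onset = word[:first]
    vowels = word[first:first + n_vowels]
    coda = word[first + n_vowels:]
    if len(onset) > FIRST_VOWEL_SLOT or len(coda) > N_SLOTS - (FIRST_VOWEL_SLOT + 2):
        raise ValueError("word does not fit into the slot template")
    slots = ["_"] * N_SLOTS
    for k, ch in enumerate(onset):   # consonants before the vowel, right-aligned
        slots[FIRST_VOWEL_SLOT - len(onset) + k] = ch
    for k, ch in enumerate(vowels):  # fifth slot, then sixth slot if needed
        slots[FIRST_VOWEL_SLOT + k] = ch
    for k, ch in enumerate(coda):    # consonants after the vowel slots
        slots[FIRST_VOWEL_SLOT + 2 + k] = ch
    return slots


def encode(word: str) -> List[int]:
    """Flatten the slot layout into a 14 x 26 one-hot input vector."""
    vector: List[int] = []
    for ch in slot_layout(word):
        vector.extend(1 if ch == letter else 0 for letter in ALPHABET)
    return vector


print(" ".join(slot_layout("street")))
```

Under these assumptions the input layer has 14 × 26 = 364 units, and a word activates one unit per letter it contains.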

Training procedures
All the training procedures were identical to those in Simulation 1 except that the training set consisted of 6229 English monosyllabic words, and the context units were also active for learning PS mappings. As for Simulation 1, the training process had two phases. The first was oral language training, where the model was exposed to PS, SP, PP, and SS mappings, and the four tasks were interleaved with 40% of trials for the oral vocabulary comprehension task, 40% of trials for the meaning naming task, 10% of trials for the phonological attractor and the remaining 10% for the semantic attractor. Which word was presented to the model was determined by random sampling according to the logarithmic frequency of the word. The error score was based on the cross-entropy error computed between the target and the actual activation of the output units. A learning rate of 0.05 was used.
After oral language training, the weights of the oral language pathways were frozen. The model was trained to learn the mappings from orthography to semantics and from orthography to phonology from the full set of 6229 words in the vocabulary. As for Simulation 1, five models were trained with different random initial weights and different random samplings from the words.
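The interleaving of oral language tasks and the frequency-weighted choice of words can be sketched as below. The task proportions are those given in the text; the use of log(frequency + 1) as the sampling weight, the toy frequency counts, and the function names are assumptions introduced for illustration.

```python
import math
import random
from typing import Callable, Dict, Tuple

# Trial proportions for the four oral language tasks, as stated in the text.
TASK_WEIGHTS = {"PS": 0.40, "SP": 0.40, "PP": 0.10, "SS": 0.10}


def make_trial_sampler(word_freqs: Dict[str, float],
                       seed: int = 0) -> Callable[[], Tuple[str, str]]:
    """Return a function that draws (task, word) pairs for training trials.

    Assumption: a word's sampling weight is log(frequency + 1).
    """
    rng = random.Random(seed)
    words = list(word_freqs)
    word_weights = [math.log(word_freqs[w] + 1.0) for w in words]
    tasks = list(TASK_WEIGHTS)
    task_weights = [TASK_WEIGHTS[t] for t in tasks]

    def sample() -> Tuple[str, str]:
        task = rng.choices(tasks, weights=task_weights, k=1)[0]
        word = rng.choices(words, weights=word_weights, k=1)[0]
        return task, word

    return sample


# Hypothetical raw frequency counts for three words.
sampler = make_trial_sampler({"the": 50000, "cat": 120, "yacht": 3})
trials = [sampler() for _ in range(10000)]
ps_share = sum(task == "PS" for task, _ in trials) / len(trials)
print(round(ps_share, 2))
```

Log-compressing the frequencies keeps high-frequency words from dominating training while still presenting them more often than rare words.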

Testing procedures
At the end of oral language training, the model was tested on both the oral vocabulary comprehension and meaning naming tasks. At the end of reading training, the model was tested on both the reading aloud and written word comprehension tasks, exactly as for Simulation 1.

Word reading performance
Based on the training schedule used in Chang and Monaghan (2019), the oral language training was halted after two million presentations. The model achieved 91.3% correct on the meaning naming task and 90.6% correct on the oral vocabulary comprehension task. For reading training, the model was then trained on one million presentations. At the end of the training, the model produced near perfect performance: 99.97% for reading aloud and 99.01% for written word comprehension. These results demonstrated that the model successfully acquired English spoken and written word form representations as in skilled readers.

Representations
We next trained the model to learn the artificial orthography with the fully developed English reading system in place. Similar to Simulation 1, the training corpus comprised 24 artificial words, taken from the materials used in Taylor et al. (2017). Unlike in Simulation 1, however, the letters of the English alphabet could not be used as a shorthand for the novel orthographies used in the laboratory-based study because the model possessed knowledge of English. Thus we used 16 novel symbols, 12 for consonants (/!/, /#/, /?/, /$/, /%/, /&/, /(/, /)/, /{/, /}/, /[/, and /]/) and four for vowels (/;/, /:/, / < /, and / > /) to represent the novel orthography used in Taylor et al. (2017). Again, there was nothing special about the particular symbols used in the model's representations, only that the model was able to distinguish these symbols from one another and from the English alphabet. All of the representation schemes were the same as in Simulation 1 except that the letters were replaced with the symbols for the orthographic representations. For example, for the word "tep" its orthography was represented as _ _ _ t e _ p _ _ _ _ _ _ _ in Simulation 1, while it was represented as _ _ _ # ; _ [ _ _ _ _ _ _ _ in Simulation 2. Also, in Simulation 1 each letter slot comprised 26 units, one for each letter of the alphabet. Here each novel symbol was represented by a set of random activations of those 26 units. In doing so, each novel symbol was uniquely represented, distinct from every other symbol and from each letter of the alphabet.
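One way to generate such symbol codes is sketched below. It assumes binary activations and enforces distinctness by rejecting any pattern that is already in use or that has fewer than two active units (so that it cannot coincide with a one-hot letter code); the binary coding, the rejection rule, and the function name are illustrative assumptions rather than the procedure actually used.

```python
import random
from typing import Dict, Sequence, Tuple

# The 16 novel symbols: 12 consonant symbols followed by 4 vowel symbols.
SYMBOLS = list("!#?$%&(){}[];:<>")


def make_symbol_codes(symbols: Sequence[str], n_units: int = 26,
                      seed: int = 0) -> Dict[str, Tuple[int, ...]]:
    """Assign each symbol a unique random binary pattern over the slot units."""
    rng = random.Random(seed)
    codes: Dict[str, Tuple[int, ...]] = {}
    used = set()
    for symbol in symbols:
        while True:
            pattern = tuple(rng.randint(0, 1) for _ in range(n_units))
            # At least two active units rules out any one-hot letter pattern.
            if sum(pattern) >= 2 and pattern not in used:
                break
        used.add(pattern)
        codes[symbol] = pattern
    return codes


codes = make_symbol_codes(SYMBOLS)
print(len(codes), len(set(codes.values())))
```

Because the patterns reuse the existing 26 units per slot, the network architecture can stay unchanged while the novel symbols remain distinguishable from the trained letter codes.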
The orthographic input to the model was thus across a set of 16 novel orthography units that had not been previously trained in the model. The phonological representations were also novel, but contained individual phonemes that had occurred during the training on English phonology. For the semantic representations, as in Simulation 1 and Taylor et al. (2017), a set of English word meanings was used and randomly assigned to the artificial words. These were a subset of the 6229 word meanings the model was exposed to in the English training regime. The model's training on the artificial words therefore more closely mimicked the experience of the human participants in Taylor et al. (2017) than in Simulation 1, since it had prior knowledge of the individual phonemes as well as the word meanings.

Training procedures
The training procedure for learning the artificial orthography had two phases: oral language and reading training. During these two phases of training, the model learned the novel artificial words. Meanwhile, the model also continued to be exposed to English words. This was to mimic the laboratory-based situation where participants were trained on the sounds and meanings of the novel artificial orthography, but also continued to use the English language for speaking, comprehension and reading. The details of the training regime can be seen in Table 1.
During oral language training for the words in the artificial language, the model learned to map between phonological and semantic representations for the 24 novel items in the artificial language, alongside maintenance of English oral language tasks and English reading tasks. The training ratios were 10%, 40% and 50% respectively. These ratios were chosen to ensure that the model continued to receive substantial English language input, mimicking the exposure of participants to spoken English in everyday life in between the training sessions in Taylor et al. (2017). After the oral language training, the model was required to learn to read the artificial words with OP focused or OS focused training for 70% of the training trials. That is, the OP focused training model received 52.5% of trials for the OP mappings and 17.5% of trials for the OS mappings. Conversely, the OS focused training model received 17.5% of trials for the OP mappings and 52.5% of trials for the OS mappings. Additionally, the model also continued to maintain its English reading knowledge for the remaining 30% of the learning trials, involving both phonology and semantics produced from orthographic inputs. Note that the training ratios used here were designed to simulate three key training settings in the Taylor et al. (2017) study as closely as possible: (1) the participants received far fewer training sessions for oral language learning than for reading learning; (2) for the OP focused training, there were three times as many OP trials as there were OS trials, with the opposite pattern for the OS focused training; and (3) the participants received substantial exposure to English in between the training sessions in the artificial orthography learning study.
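The trial percentages above follow directly from splitting the 70% of artificial word reading trials in a 3:1 ratio. A minimal sketch of that arithmetic is given below; the function name and the dictionary keys are hypothetical labels chosen for illustration.

```python
from typing import Dict


def reading_trial_mix(focus: str, artificial_share: float = 0.70,
                      focus_ratio: float = 3.0) -> Dict[str, float]:
    """Proportion of reading-phase trials per task for a given training focus.

    The focused mapping receives `focus_ratio` times as many artificial word
    trials as the other mapping; the remaining trials maintain English reading.
    """
    if focus not in ("OP", "OS"):
        raise ValueError("focus must be 'OP' or 'OS'")
    other = "OS" if focus == "OP" else "OP"
    minor = artificial_share / (focus_ratio + 1.0)   # 0.70 / 4 = 0.175
    major = artificial_share - minor                 # 0.70 - 0.175 = 0.525
    return {focus: major, other: minor, "English": 1.0 - artificial_share}


for focus in ("OP", "OS"):
    mix = reading_trial_mix(focus)
    print(focus, {task: round(share, 3) for task, share in mix.items()})
```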
For artificial word learning, the impact of oral language skills (low, medium, or highly proficient) on reading development was investigated. Three levels of oral language skills were simulated by 4000, 8000 or 15,000 learning trials on the novel PS and SP mappings. The number of learning trials was increased from the previous simulations because the model was trained on both the broader English vocabulary (90% of the trials) and the artificial vocabulary (10% of the trials) at the same time, and this was a computationally more intense task than in Simulation 1, where between 500 and 2000 learning trials were utilised. Due to the greater task complexity, we also employed a small learning rate of 0.01 in order to minimise interference between representations of the English and the novel artificial language (McClelland, McNaughton, & O'Reilly, 1995) and to ensure that the representations of the English vocabulary and the artificial word set could co-exist effectively. All the other training procedures were otherwise identical to those described in Simulation 1.

Testing procedures
All the testing procedures were the same as those in Simulation 1.

Results
For the artificial oral language tasks, the models trained with 4000, 8000, and 15,000 presentations achieved 45.83%, 82.5%, and 95.83% accuracy on the meaning naming (SP) task, and 48.33%, 79.17% and 98.33% accuracy on the oral vocabulary comprehension (PS) task, respectively. The resulting accuracies of the three oral language skill levels were selected to match those in Simulation 1; however, the training times in Simulation 2 were longer than in Simulation 1 because training was interleaved between English language tasks and artificial language tasks, so more trials were needed to reach similarly accurate performance on each task.
For the reading training, the average performance on artificial words of the OP and OS focused models with the different levels of oral language skills over the time course of artificial word reading training is shown in Fig. 5. Again, the 4000 learning trials exceeded the number used in the previous simulation because training was interleaved between English and artificial language tasks. As for Simulation 1, we analysed the model's performance on artificial words using GLMMs, with accuracy in reading aloud or written word comprehension as the dependent variable in separate analyses. Item and simulation were included as random factors, and training […]

Table 1
The training paradigm for learning to read a second orthography: stage 1 for learning English words and stage 2 for learning English words plus artificial words. Both stages have phase 1 and phase 2 learning.

For reading aloud, the OP focused training model performed better than the OS focused training model. Adding training focus as a fixed factor resulted in a significant improvement in model fit compared to a model with random effects of item and simulation and with fixed effects of oral language training and reading training stage, χ²(1) = 1220.7, p < .001. The effect of different levels of oral language skills was also significant: adding oral language training as a fixed factor resulted in a significant improvement in model fit compared to a model with random effects of item and simulation and with fixed effects of reading training stage and training focus, χ²(2) = 12.71, p < .01. However, the interaction between training focus and oral language training was not significant, χ²(2) = 1.47, p = .48.
For written word comprehension, the OS focused training model performed better than the OP focused training model. This was confirmed in the GLMM, as adding training focus improved model fit, χ²(1) = 453.94, p < .001. Oral language training also had a significant effect, χ²(2) = 50.98, p < .001. More importantly, oral language training had a larger impact on the OP focused model than the OS focused model, χ²(2) = 6.96, p = .03. Relative to lower oral language skills, the performance of the OP focused training model and the OS focused training model converged to a greater extent in the medium oral language condition, β = −0.72, p < .001, and the high oral language condition, β = −0.59, p = .056.
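The χ² values reported for these model comparisons are likelihood-ratio statistics, and their p values can be recovered from the chi-square distribution. The sketch below uses the closed-form survival functions for 1 and 2 degrees of freedom so that it needs only the standard library; the function names are hypothetical, and the log-likelihood inputs to `lr_statistic` would come from the fitted nested models, which are not reproduced here.

```python
import math


def lr_statistic(loglik_reduced: float, loglik_full: float) -> float:
    """Likelihood-ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_p_value(statistic: float, df: int) -> float:
    """Upper-tail chi-square probability for df = 1 or df = 2.

    df = 1: P(X > x) = erfc(sqrt(x / 2));  df = 2: P(X > x) = exp(-x / 2).
    """
    if statistic < 0:
        raise ValueError("statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return math.exp(-statistic / 2.0)
    raise ValueError("only df = 1 and df = 2 are supported in this sketch")


# Interaction tests with 2 degrees of freedom reported for Simulation 2.
print(round(chi2_p_value(6.96, 2), 2))   # written word comprehension
print(round(chi2_p_value(1.47, 2), 2))   # reading aloud
```

Applied to the two interaction tests with 2 degrees of freedom, the sketch reproduces the reported p = .03 and p = .48.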
These results are largely in accordance with those in Simulation 1. The only exception is that oral language skills had a significant impact on reading aloud performance for both the OP focused and OS focused training models, although note that Fig. 5 shows this effect to be relatively small. The pattern of results from Simulation 1, which mimicked a learner acquiring a novel orthography without previous knowledge of English, was largely replicated in Simulation 2, which mimicked the acquisition of an artificial orthography after pre-training on English. Interference or support effects from first to subsequent language learning were not shown to affect the key observations of the influence of differences in focus during orthographic training, and the interaction with existing oral language skills in the model. The results thus demonstrate that studying training manipulations in a fully trained adult reading system is a valid means of studying mechanisms of reading acquisition in terms of how the different training conditions relate to operation of a computational model of reading.

Division of labour
To further investigate the effect of prior language knowledge on acquisition of a novel orthography, division of labour analyses across the reading pathways were conducted to examine whether the influence of oral language on use of the reading architecture might be similar or different between a previously orthographically trained (Simulation 2) versus untrained system (Simulation 1). All the procedures were identical to those described in Simulation 1.
Fig. 5. The performance of the OP and OS focused models on reading aloud and written word comprehension in Simulation 2 with different amounts of oral language training over the time course of the artificial word reading training. The artificial orthographies were learned with the existence of the participants' spoken and written representations in English.
Fig. 6 shows the resulting pattern of division of labour across the pathways for reading aloud and written word comprehension with different levels of oral language. For reading aloud, a very similar pattern was observed for both the OP focused training model and the OS focused training model, where the OP pathway was strongly dominant, irrespective of oral language level. Despite more training on the OS mappings, the OS focused training model relied heavily on the OP pathway to support reading aloud. This was different from the pattern in Simulation 1, where the OP and OSP pathways were more equally used for reading aloud. For written word comprehension, oral language skills had a substantial impact on the use of both OS and OPS pathways for both the OP focused and OS focused training models. The use of the OPS pathway increased with the proficiency of oral language skills. This is similar to the pattern observed in Simulation 1 except that for the OS focused training model the OS pathway was more dominant in Simulation 2 than in Simulation 1.
Collectively, the effect of oral language is similar in the trained and untrained systems for both the OP and OS focused training models. However, the pattern of dominance in reading pathways was affected by prior knowledge of English. Specifically, there was an increased use of the OP mappings for reading aloud and the OS mappings for written word comprehension, particularly for the OS focused training model. The results may be explained in the context of second language learning. When L2 words are processed, it has been widely observed that the phonological and/or orthographic representations of L1 translation equivalents are concurrently activated due to the shared concept (Thierry & Wu, 2007; Wu & Thierry, 2010). Although the co-activation of L1 during L2 word recognition has been an important concept within localist models of bilingual word recognition (Dijkstra & van Heuven, 1998, 2002; Dijkstra et al., 2018), models developed based on distributed representations can also simulate cross-language effects of translation and semantic priming (e.g., Zhao & Li, 2012). The notion of the concept being thought of as language independent and based on collections of features has been well rehearsed in the bilingual literature (Brysbaert, Ameel, & Storms, 2014; de Groot, 1992; Kroll & de Groot, 1997). On this account, words in L1 and L2 with large meaning overlap share more features relative to words with language-specific meanings. The larger the meaning overlap between L1 and L2 words, the stronger the co-activation expected.
Therefore, in the model, for reading aloud, the mappings from semantics to phonology may cause some interference because the meaning of an artificial word is linked not only to its phonological representation but also to the phonological representation of the English word that shares the same meaning. Hence, the OP pathway may be prioritised. Similarly, for written word comprehension, the phonological confusion generated from semantics to phonology may impede the use of the OPS pathway, and this was more pronounced for the OS than the OP focused training model because of more intensive training on semantics. Overall, however, in Simulation 2, the OPS pathway was used following both OP and OS focused training and was moderated by oral language skills, albeit to a lesser extent than in Simulation 1.

Simulation 3: Learning to read a fuller vocabulary
Simulations 1 and 2 reflected the behavioural data from a laboratory-based artificial orthography study that varied the extent to which reading acquisition focused on reading for meaning versus reading aloud. The simulations showed that pre-literacy oral language skills were essential to the observed advantage of focused training seen in these controlled studies of literacy development. In Simulation 3 we extended the literacy training beyond the confines of a laboratory-based study, to test whether the findings generalised to learning to read a fuller vocabulary of English consisting of 6229 monosyllabic words, more closely approximating children's literacy learning. This also provides a test of different training regimes when there are both consistent and inconsistent print-to-sound mappings, which may reduce the dependence on the OP pathway (Harm & Seidenberg, 2004), and also where there are some regularities in the OS pathway, in terms of the presence of morphology (Seidenberg & Gonnerman, 2000). In Simulation 3 we again varied the oral language skills of the model prior to literacy training, and then tested the effect of OP or OS focused training on the model's developing ability to read, both in terms of reading aloud and written word comprehension.

Model architecture
The architecture was the same as in Simulations 1 and 2.

Representations, training and test procedures
All of the representations, training and testing procedures were identical to the English reading training in Simulation 2. The training set consisted of 6229 English monosyllabic words. As in the other simulations, three levels of preliterate oral language skills, from poorer through to near-ceiling vocabulary knowledge, were simulated by training the model with different amounts of exposure: 0.4 million, 1.2 million, and 2 million presentations. After oral language training, the model was trained to read the English words for 1 million presentations, and the model's performance was assessed every 0.1 million presentations to determine how learning progressed. These training trials were derived from our previous computational modelling of reading development (Chang & Monaghan, 2019), which demonstrated the effect of quantity of oral language exposure on learning to read (but which did not examine different reading regimes). The models were trained with two different foci of reading instruction. For the OP focused training model, there were three times as many OP mappings as OS mappings. Conversely, for the OS focused training model, there were three times as many OS mappings as OP mappings. As in previous simulations, five versions of the model were trained with different initial random weights.

Results
For the oral language tasks, the models trained with 0.4 million, 1.2 million, and 2 million presentations achieved 52.5%, 83.2%, and 90.66% accuracy on the meaning naming (SP) task, and 39.68%, 82% and 91.7% accuracy on the oral vocabulary comprehension (PS) task, respectively.
The learning trajectories for the OP and OS focused training models with the different levels of oral language skills are shown in Fig. 7. As can be seen, the model learned to perform the reading aloud task more quickly and accurately than the written word comprehension task, reflecting the greater ease of the quasi-regular OP mappings compared to the largely arbitrary OS mappings in English. Moreover, for reading aloud, the influence of oral language skills seems to be stronger for the OS focused training model than the OP focused training model. For written word comprehension, the performance of both the OP focused and OS focused training models seems to be greatly moderated by oral language skills.
To confirm these observations, we analysed the model's performance on English words using GLMMs with accuracy in reading aloud or written word comprehension as the dependent variable in two separate analyses. Item and simulation were included as random factors, and oral language training (0.4 million, 1.2 million, and 2 million presentations), training focus (OP or OS), and reading training stage (0.1 million to 1 million presentations in steps of 0.1 million) were included as fixed factors.
For reading aloud, the OP focused training model performed better than the OS focused training model: adding training focus as a fixed factor resulted in a significant improvement in model fit compared to a model with random effects of item and simulation and fixed effects of oral language training and reading training stage, χ²(1) = 10,892, p < .001. Adding oral language training as a fixed factor also resulted in a significant improvement in model fit compared to a model with random effects of item and simulation and with fixed effects of reading training stage and training focus, χ²(2) = 1372.7, p < .001, demonstrating the positive effect of oral language skills on reading aloud (as in Chang & Monaghan, 2019). However, this was moderated by a significant interaction between training focus and oral language training, χ²(2) = 201.96, p < .001. This interaction arose from the fact that the advantage of the OP over the OS focused training model was significantly less pronounced for both the medium, β = −0.31, p < .001, and high, β = −0.34, p < .001, oral language skills relative to low oral language skills.
For written word comprehension, the OS focused training model generally performed better than the OP focused training model, except for the OS focused training model with the low oral language skill early in learning. Adding training focus, χ²(1) = 141,757, p < .001, and oral language training, χ²(2) = 181,969, p < .001, both improved model fit. Moreover, the effect of oral language training had a larger impact on the OP focused training model than the OS focused training model, with the addition of the interaction to the analysis resulting in a significant improvement in fit, χ²(2) = 7,591.4, p < .001. The performance difference between the OP focused training model and the OS focused training model was significantly smaller for both medium, β = −1.06, p < .001, and high, β = −1.26, p < .001, oral language skills relative to the low oral language skills.
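To give a sense of scale for the interaction coefficients, the sketch below converts them to multiplicative factors on the odds scale. It assumes that the GLMMs used a logit link for binary accuracy, which is the usual choice but is not stated explicitly in this section. Under that assumption, exp(β) for an interaction term is the factor by which the odds ratio for the training focus contrast changes relative to the low oral language condition.

```python
import math

# Interaction coefficients reported in the text (log-odds scale, assuming a logit link).
interaction_betas = {
    ("reading aloud", "medium"): -0.31,
    ("reading aloud", "high"): -0.34,
    ("comprehension", "medium"): -1.06,
    ("comprehension", "high"): -1.26,
}

# exp(beta) is the multiplicative change in the training focus odds ratio
# relative to the low oral language reference level.
odds_factors = {key: math.exp(beta) for key, beta in interaction_betas.items()}

for (task, level), factor in odds_factors.items():
    print(f"{task:14s} {level:7s} factor = {factor:.2f}")
```

The factors are roughly 0.7 for reading aloud and roughly 0.3 for written word comprehension, consistent with the observation that oral language skills moderate the training focus effect more strongly for comprehension.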
The simulation results showed that there was an effect of oral language training on both reading aloud and written word comprehension. However, the effect of oral language training appeared to be larger for written word comprehension than for reading aloud. This was confirmed by a further analysis that combined the data for both reading aloud and written word comprehension and revealed a significant three-way interaction between training focus, oral language training, and task, χ²(2) = 18,470, p < .001, when all of the other random and fixed effects were included.
The results of Simulation 3 were thus largely similar to those in Simulations 1 and 2, demonstrating the effect of oral language skills on reading instruction, but extending these effects to acquisition of a larger vocabulary. The simulation was thus more closely aligned to the task facing children acquiring literacy for the first time, and shows that the same principles of learning and transfer largely apply both for the small artificial orthography task and for learning to read a larger set of words in first language literacy development.

Fig. 7. The accuracy performance of the OP and OS focused models on reading aloud and written word comprehension in Simulation 3 with different amounts of oral language training over the time course of the English reading training.
Interestingly, however, in Simulation 3, oral language training influenced reading aloud performance and its effect interacted with training focus, with a stronger effect for the OS than the OP focused training model. This moderation of reading aloud by oral language training was not observed in Simulations 1 and 2 for the artificial word learning. The result suggests that for reading aloud, the reliance of the OS focused training model on the mappings from print to meaning and then to sound was dependent on oral language skills to a greater extent for the large English vocabulary compared to the artificial word learning simulations.

Division of labour
To better understand the observed results, division of labour analyses across the reading pathways were conducted. All the procedures were identical to those described in the previous simulations. Fig. 8 shows the division of labour with different training regimes and with different levels of oral language for both reading aloud and written word comprehension. For reading aloud, both the OP focused training model and OS focused training model used the OP pathway much more than the OSP pathway. However, there was a small but gradually increasing use of the OSP pathway with proficiency of oral language skills, particularly for the OS focused training model. This result is in accordance with the effect of oral language skills on the reading aloud performance of the OS focused training model.
It is worth noting that for the reading aloud task, the contribution of the OSP pathway was moderated by oral language skills to a greater extent in this simulation than in Simulations 1 and 2. This may reflect the fact that, whereas for the artificial words the mapping between letters and phonemes was entirely regular (i.e. 1 letter mapped to 1 phoneme), for the words in Simulation 3, 80.1% of words are regular whilst 19.9% are irregular. Thus, while the OP pathway is dominant for reading aloud, for some words, particularly those with irregular spelling-to-sound mappings, use of the OSP pathway is helpful (Plaut et al., 1996).
For written word comprehension, both the OP focused training model and the OS focused training model relied more on the OPS pathway than the OS pathway to access meaning. The use of the OPS pathway was greater for the OP focused training model than the OS focused training model, and for high than low oral language skills, demonstrating once again why oral language skill influences the effectiveness of both training regimes.

General discussion
We developed a fully implemented connectionist model of reading that mapped between orthography, phonology, and semantics to explore the influence of oral language on the effectiveness of different types of reading instruction. The laboratory-based behavioural study conducted by Taylor et al. (2017) indicated that focusing on learning mappings between print and sound resulted in better reading aloud and also transferred to support written word comprehension. In contrast, focusing on learning print to meaning mappings had little advantage for written word comprehension, and resulted in deficiencies in reading aloud. The implication, if these findings extend to children's learning, is that, given limited instructional time, instruction should focus on phonics rather than on whole-word, meaning-based strategies for reading acquisition.
Simulation 1 trained the model to learn artificial orthographies from scratch without pre-existing knowledge of English, mimicking children's learning of a very small vocabulary. Simulation 2 trained the model to learn artificial orthographies with a fully developed English system, mimicking literate adults' learning of a second orthographic system, as in Taylor et al. (2017). These simulations replicated the laboratory-based effects of different reading regimes as tested in Taylor et al. (2017). For reading aloud, the OS focused training model showed deficiencies compared to the OP focused training model whereas, for written word comprehension, the deficiencies of the OP relative to the OS focused model were less pronounced. In these respects, the model replicated the key effects of the advantage of OP focused training for the early stages of reading acquisition.
However, the equivalent performance for OP and OS focused training models in written word comprehension was dependent upon the model's level of oral language training. Only when the model had previously developed high accuracy in its mappings between phonology and semantics was it able to transfer performance from OP training trials to perform well on written word comprehension. Thus, the pattern of performance from the OP focused training model was similar to the behavioural data reported in Taylor et al. (2017) only for the model that was pre-trained to a high level of oral language skills. OP focused training is advantageous only if the reading system is in a position to exploit pre-existing mappings between phonology and semantics to generate a word's meaning from its written form, via phonology. A high-fidelity phonological representation derived from a written word form cannot accurately activate the target semantic representation if the phonology-to-semantics mapping for the particular word being read is not present. OP focused training, then, is most advantageous for written word comprehension when the learner has good oral language knowledge, consistent with views that promote the critical role of pre-literate oral language skills to support development of reading (Curtis, 1980; Gough & Tunmer, 1986; Nation & Snowling, 2004; Ouellette & Beers, 2010; Ricketts et al., 2007; Taylor et al., 2015).
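The dependence of the indirect route on both of its component mappings can be illustrated with a toy calculation in the spirit of the Simple View of Reading, which expresses reading comprehension as the product of decoding and linguistic comprehension (Gough & Tunmer, 1986). This is a deliberately simplified sketch and not the computation performed by the model: it assumes that the print-to-sound and sound-to-meaning steps succeed independently, and all accuracy values are hypothetical.

```python
def comprehension_via_phonology(op_accuracy, ps_accuracy):
    """Toy estimate of comprehension accuracy through the indirect OPS route.

    Assumes (for illustration only) that decoding print to sound (OP) and
    mapping sound to meaning (PS) succeed independently, so the route as a
    whole succeeds with the product of the two probabilities.
    """
    return op_accuracy * ps_accuracy

# Hypothetical values: decoding is already strong after OP focused training,
# while oral language (PS) proficiency varies from low to high.
op_accuracy = 0.95
for ps_accuracy in (0.3, 0.6, 0.9):
    estimate = comprehension_via_phonology(op_accuracy, ps_accuracy)
    print(f"PS accuracy {ps_accuracy:.1f} -> OPS comprehension {estimate:.3f}")
```

Under these assumptions, even near-perfect decoding yields poor comprehension when the phonology-to-semantics mapping is weak, which is the qualitative pattern described above.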
There was a small divergence between the results of Simulations 1 and 2 and the behavioural results of Taylor et al. (2017). In Taylor et al. there was no significant advantage of OS focused training over OP focused training for written word comprehension accuracy, except at the very end. In contrast, the simulation results showed a small initial advantage of OS focused training over OP focused training for written word comprehension, with the difference diminishing over the time course of learning. It is likely that the model has a greater capacity for learning arbitrary OS mappings than the participants in the behavioural study. One could perhaps develop a model with reduced capacity to increase the difficulty of learning the mappings, but we would expect the effect of oral language skills for the OP focused training to remain similar.
Probing the operation of the model in terms of the division of labour analyses in Simulations 1 and 2 enabled us to determine how the model solved the reading tasks, and highlighted the similarities and differences between Simulations 1 and 2. Considering first reading aloud, Simulation 1 demonstrated that the direct OP pathway was used regardless of training focus. This is because the systematic OP mappings are easier to learn compared to the indirect OSP pathway that requires acquisition of two arbitrary mappings (OS and SP). The use of the OP pathway for reading aloud was somewhat increased in Simulation 2 when the model learned artificial words with prior knowledge in English. This may be due to potential phonological interference in the SP mappings, because the meanings of artificial words are shared with some English words but their phonological representations are different (Thierry & Wu, 2007; Wu & Thierry, 2010). The role of interference and transfer between languages, and between orthographic systems, is a key topic for future investigation.
Considering written word comprehension, in the division of labour analyses the indirect OPS pathway was used effectively, and the magnitude of operation of this pathway was moderated by oral language training for both the OP and OS focused training model in Simulation 1. This suggests that the reading system exploits regular OP mappings in conjunction with a previously learned arbitrary PS mapping, due to the direct OS pathway being arbitrary and consequently difficult to learn. A similar pattern was observed in Simulation 2. However, the direct OS pathway was used more in Simulation 2 compared to Simulation 1. It is possible that the phonological interference residing in the SP pathway might have impeded the efficiency of OP mappings to activate high fidelity phonological representations. Collectively, the division of labour results from Simulations 1 and 2 demonstrated that the reading system uses similar pathways to perform the tasks of reading aloud and written word comprehension, with or without prior knowledge of English. But there is potential phonological interference when the model must learn to read novel words that are linked to existing meanings.
Simulation 3 extended the first two simulations by examining the effect of training focus on the model's ability to learn to read a larger, more representative vocabulary of English, to more closely approximate the conditions of children's literacy learning. The results confirmed the general effectiveness of OP focused training, consistent with the behavioural results reported in Taylor et al. (2017), and that the advantage was modulated by oral language skills as in our Simulations 1 and 2. However, the features of the wider vocabulary have some influence on the division of labour between reading pathways in the model. For instance, in Simulation 3, there was a small but reliable effect of oral language training on reading aloud for the OS focused training model, which was not present in Simulations 1 or 2. This suggests that the OS focused training model also accessed phonology through the semantically mediated pathway (OSP). It is likely that semantic knowledge is particularly helpful for reading aloud words with inconsistent print-to-sound mappings (Plaut et al., 1996). However, the reading system can only exploit these OSP mappings if the language system has in place effective oral language skills that permit the transfer between semantic and phonological representations. This interpretation was supported by the division of labour results, demonstrating that use of the indirect OSP pathway for reading aloud increased in the context of pre-acquired oral language skills.
For written word comprehension, a large effect of oral language training was observed for both the OS and OP focused training models, whereas in Simulations 1 and 2 oral language had a greater influence on the OP focused model. The division of labour results showed that, in general, the use of the OPS pathway was more pronounced compared to that in Simulations 1 and 2. These results suggest that the large vocabulary may reduce the model's reliance on the arbitrary mappings between orthography and semantics. However, it is worth noting that unlike the model, children acquiring reading skills in their first language are not required to learn to read the entire vocabulary from the outset; instead, their reading gradually increases in terms of both vocabulary size and the complexity of the OP mappings required. Further simulations that build a more realistic, graded accumulation of reading skills could test the contribution of large versus gradual vocabulary acquisition on the effect of different reading training regimes (Chang, Monaghan, & Welbourne, 2019; Monaghan & Ellis, 2010).
Taken together, the simulation results demonstrate that there are subtle differences in how different reading pathways are used to support learning of artificial words and English words under different training conditions. However, the general patterns observed follow from the influence of systematic versus arbitrary mappings for generating sound and meaning within an alphabetic orthography. These differences in the systematicity of mappings result in greater use of the OP (as compared to OSP) and OPS (as compared to OS) pathways for reading aloud and comprehension, respectively. The simulation results are thus largely compatible with empirical evidence of the benefit of both print to sound decoding skills and oral language skills on reading ability (e.g. Curtis, 1980; Nation & Snowling, 2004; Ouellette & Beers, 2010; Ricketts et al., 2007).
In this study, we have also demonstrated that oral language skills alter the division of labour in the triangle model of reading to modulate the effectiveness of reading instruction. These findings highlight the importance of considering the multiplicity of factors that underlie effective literacy instruction within a dynamic, adaptive and rapidly changing cognitive architecture in the early stages of reading acquisition. Much of the policy discussion relevant to reading instruction has focused on the provision of systematic phonics in the initial stages of learning to read (e.g. Rose, 2006). Systematic phonics instruction is necessary in alphabetic writing systems because knowledge of how graphemes relate to phonemes does not come naturally to most children (see Castles et al., 2018 for discussion). However, psychological research on reading acquisition has long recognized that systematic phonics instruction is just one component of the journey to skilled reading (Castles et al., 2018). Foundational oral language (e.g. Hjetland et al., 2019; Nation & Snowling, 2004), print experience (e.g. Nation, 2017), morphological knowledge (e.g. Rastle, 2019), and higher-level comprehension (e.g. Perfetti & Stafura, 2014) are all building blocks to developing reading expertise (Castles et al., 2018). Our work provides a computational basis for understanding why phonics instruction is so powerful in the initial stages of reading acquisition, and also shows why it is so important that children start reading instruction with foundational oral language skills in place.
Our simulations show that training on spelling-sound mappings enables accurate reading aloud and comprehension, but that the effectiveness of this training hinges on oral language proficiency. These simulations therefore also provide insight into the challenges pupils with poor oral language skills face in learning to read. The results of Simulation 3 (Fig. 7) establish clearly that poor oral language impacts more on reading comprehension than on reading aloud. This pattern replicates observations of reading abilities in children with language impairment (Bishop, McDonald, Bird, & Hayiou-Thomas, 2009) and provides a computational basis for those observations. Our simulations of reading comprehension (Fig. 7) also appear to suggest that training on spelling-sound mappings may actually be harmful for individuals with poor oral language, and that memorizing the meanings of individual written words may prove more effective. Though this is theoretically possible, it is important to remember that children and the model may vary in their capacity for arbitrary learning of the meanings of individual words. Further, our simulations model the consequences of different instructional regimes while assuming that oral language deficits are fixed. In contrast, studies of children in this population reveal that these deficits and their associated reading comprehension weaknesses can be addressed through oral language interventions (Fricke, Bowyer-Crane, Haley, Hulme, & Snowling, 2013; Snowling & Hulme, 2011).
To conclude, in line with the Simple View of Reading and the triangle model of reading (Gough & Tunmer, 1986; Harm & Seidenberg, 2004), our simulation research demonstrates that oral language proficiency is a vital foundation for reading, and may modulate the effectiveness of reading instruction. This research suggests that a strong oral language foundation accompanied by instruction on spelling-sound mappings allows the process of reading acquisition to exploit the characteristics of pathways linking written, spoken, and meaning representations of words in alphabetic writing systems.

Fig. 2. The architecture of the model.

Fig. 3 (left) shows the average performance of the OP and OS focused models with the different quantities of oral language training at different stages of reading training. Fig. 3 (right) shows the performance of the participants trained with the OP versus OS focused training on each day, taken from Taylor et al. (2017), Figs. 3 and 4.¹ We analysed the model's performance using generalised linear mixed-effects models (GLMM) with accuracy in reading aloud or written word comprehension as the dependent variable, depending on the task. Item and simulation were included as random factors, and training focus (OP or OS), reading training stage (epoch 100-1000 in steps of 100) and oral language training (500, 1000, or 2000 epochs) were included as fixed factors. The reading training stage was log transformed prior to the GLMM analyses.
prehension in the OS focused training model. Though Taylor et al. (

Fig. 3. Left panel shows the accuracy performance of the OP and OS focused models on reading aloud and written word comprehension in Simulation 1 with different amounts of oral language training over the time course of the artificial word reading training. The artificial orthographies were learned without any prior experience of reading other orthographies. Right panel shows the performance of the participants trained with the OP and OS focus languages on each day from Taylor et al. (2017).

Fig. 4. The patterns of division of labour (DOL) for both (a) reading aloud and (b) written word comprehension in the OP focused training model and OS focused training model with different amounts of oral language training. OP: orthography-to-phonology; OSP: orthography-to-semantics-to-phonology; OS: orthography-to-semantics; OPS: orthography-to-phonology-to-semantics.

Fig. 6. The model learned artificial words with prior experience of English. The patterns of division of labour (DOL) for both (a) reading aloud and (b) written word comprehension in the OP focused training model and OS focused training model with different amounts of oral language training. OP: orthography-to-phonology; OSP: orthography-to-semantics-to-phonology; OS: orthography-to-semantics; OPS: orthography-to-phonology-to-semantics.

Fig. 8. The model learned to read the entire vocabulary of English. The patterns of division of labour (DOL) for both (a) reading aloud and (b) written word comprehension in the OP focused training model and OS focused training model with different amounts of oral language training. OP: orthography-to-phonology; OSP: orthography-to-semantics-to-phonology; OS: orthography-to-semantics; OPS: orthography-to-phonology-to-semantics.