Literacy and early language development: Insights from computational modelling

Abstract
Computational models of reading have tended to focus on the cognitive requirements of mapping among written, spoken, and meaning representations of individual words in adult readers. Consequently, the alignment of these computational models with behavioural studies of reading development has to date been limited. Models of reading have provided us with insights into the architecture of the reading system, and these have recently been extended to investigate literacy development, and the early language skills that influence children’s reading. These models show us how learning to read builds on early language skills, why various reading interventions might be more or less effective for different children, and how reading develops across different languages and writing systems. Though there is growing alignment between descriptive models of reading behaviour and computational models, there remains a gap, and I lay out the groundwork for how translation may become increasingly effective through future modelling work.

Translating between behavioural and computational models of literacy
Literacy is a foundational skill in children's education, and early literacy development has a profound effect on life outcomes (Harold, Acquah, Sellers & Chowdry, 2016; von Hippel, Workman & Downey, 2017). Understanding how children learn to read, and how best to support their literacy development, are thus crucially important issues in children's early development. There has been substantial progress in describing early literacy development in children, and in uncovering sets of tasks relating to children's early language and educational development that are predictive of children's reading skills (e.g., Castles, Rastle & Nation, 2018). In particular, there has been growing recognition of the role of children's early language skills in learning to read once children begin formal education (Dickinson, Golinkoff & Hirsh-Pasek, 2010; Duff, Reen, Plunkett & Nation, 2015; Hjetland, Brinchmann, Scherer, Hulme & Melby-Lervåg, 2020; Lee, 2011; Muter, Hulme, Snowling & Stevenson, 2004; Ouellette & Beers, 2010; Ricketts, Nation & Bishop, 2007; Snow, Burns & Griffin, 1998), resulting in increasing interest in incorporating oral language skills into theoretical and computational models of reading development.
Reading is frequently differentiated into reading fluency and reading comprehension (Foorman, Herrera, Petscher, Mitchell & Truckenmiller, 2015; Ouellette, 2006; Ouellette & Beers, 2010). Reading fluency measures the reader's ability to produce spoken forms of words from written forms. Reading comprehension relates to the reader's ability to determine the meaning of text, either in terms of knowing the meaning of individual words or in terms of understanding events described in sentences, or narratives described in paragraphs (Language and Reading Research Consortium, 2015). Though these skills are related, they do show distinct trajectories during literacy development, with fluency and comprehension closely related in early stages of reading, but tending to bifurcate later in reading development, as syntactic and discourse structures tend to become more complex (Language and Reading Research Consortium, 2015; Ouellette, 2006).
The Simple View of Reading (SVR; Gough & Tunmer, 1986) provided a milestone in developing a framework for how early language skills affect reading development. The SVR proposed that reading rests upon two key abilities: first, the ability to decode letters and sets of letters and map them onto speech sounds; and second, oral language comprehension skills. In a large-scale meta-analysis relating oral language to reading development, Hjetland et al. (2020) estimated the extent to which decoding and reading comprehension skills in early years of formal education related to various early language abilities. They found that children's reading fluency was predicted directly by decoding skills, involving children's letter knowledge and abilities to manipulate and recognize phonemes and rhymes, and indirectly by oral language skills involving vocabulary and grammatical skills, which contributed by enhancing children's decoding skills (see Figure 1). Reading comprehension was predicted directly by both oral language skills and by reading fluency, consistent with the SVR framework (see also Catts, Adlof & Weismer, 2006). The meta-analysis also revealed that reading comprehension was influenced more by oral language skills and less by reading fluency skills in more advanced reading stages than in earlier stages, where reading fluency had a stronger influence on comprehension. These large-scale behavioural studies provide substantial insight into the kinds of tasks that can predict children's reading development, yet there remains a disconnect between behavioural descriptions of reading and theoretical models that examine the cognitive mechanisms involved in children's learning to map between written, spoken, and meaning forms of words. To address this disconnect, computational models of reading have attempted to articulate precisely the processing requirements for converting written into spoken and meaning forms of words, and consequently clarify the cognitive mechanisms required for learning to read. In the next section, I describe the key frameworks for computational models of reading, then I show how these have been extended to take into account children's early language skills as a precursor to literacy development.
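The Simple View of Reading introduced above is conventionally summarised, following Gough and Tunmer (1986), as a product of its two components. The symbols R, D, and C below are the usual shorthand for that framework and are not notation used elsewhere in this article; the sketch is included only to make the two-component structure explicit.

```latex
% R: reading comprehension
% D: decoding (mapping letters and letter groups onto speech sounds)
% C: oral language (linguistic) comprehension
% Each quantity is conventionally scaled between 0 and 1.
R = D \times C, \qquad 0 \le D \le 1, \quad 0 \le C \le 1
```

Under this formulation reading comprehension is zero if either component is zero, which is one way of expressing the claim that neither decoding nor oral language comprehension is sufficient on its own.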

Computational models of (mature) reading
Computational modelling provides a stringent test of theoretical models of behavioural phenomena. By constructing a model, and simulating behaviour, the adequacy of assumptions about the cognitive mechanisms involved in a cognitive task can be revealed (Sawi & Rueckl, 2019). Models can then be assessed for the extent to which they approximate behaviour, and hence assessed for whether the mechanisms they implement are sufficient to explain human performance. Computational models of reading have tended to cluster around two traditions, each of which has been very productive in determining the task constraints and cognitive mechanisms involved in adult reading skills (see Seidenberg, Farry-Thorn & Zevin, 2022).
The first tradition derived from the connectionist modelling approach, and gave rise to the triangle model of reading (Figure 2). The triangle model approach investigated the computational requirements of mapping written to spoken and meaning forms that were purely consequent on the nature of those mappings themselves. Models are exposed to written forms of words and trained to map onto target spoken and meaning forms of words, by adjusting connection strengths between sets of units representing letters and sets of units representing sounds and meanings. The triangle model was thus a minimally defined architecture: the architecture of the reading system emerges from the computational requirements of forming mappings among representations.
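The learning principle described above, adjusting connection strengths so that written input produces the target spoken output, can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: the four-word vocabulary, the phoneme codes, the slot-based coding, and the single layer of connections trained with the delta rule. Published triangle models use far larger vocabularies, hidden units, and a meaning layer, none of which are reproduced here.

```python
import math
import random

# Hypothetical toy vocabulary: written form -> spoken form (made-up phoneme codes).
WORDS = {"cat": "kat", "cot": "kot", "mat": "mat", "mop": "mop"}
LETTERS = sorted({ch for w in WORDS for ch in w})
PHONEMES = sorted({ph for p in WORDS.values() for ph in p})


def encode(form, alphabet):
    """Slot-based one-hot code: one block of units per position in the word."""
    vec = []
    for ch in form:
        vec.extend(1.0 if ch == a else 0.0 for a in alphabet)
    return vec


def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))


def forward(weights, x):
    """Activation of every output (sound) unit given the input (letter) units."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def train(epochs=200, lr=0.5, seed=0):
    """Adjust connection strengths to reduce the error on spoken targets."""
    rng = random.Random(seed)
    pairs = [(encode(w, LETTERS), encode(p, PHONEMES)) for w, p in WORDS.items()]
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    errors = []
    for _ in range(epochs):
        total = 0.0
        for x, target in pairs:
            out = forward(weights, x)
            for j, (o, t) in enumerate(zip(out, target)):
                total += (t - o) ** 2
                delta = (t - o) * o * (1.0 - o)  # delta rule for a sigmoid unit
                for i, xi in enumerate(x):
                    weights[j][i] += lr * delta * xi
        errors.append(total)  # summed squared error for this epoch
    return weights, errors


def read_aloud(weights, word):
    """Pick the most active phoneme unit in each slot."""
    out = forward(weights, encode(word, LETTERS))
    n = len(PHONEMES)
    return "".join(
        PHONEMES[max(range(n), key=lambda k: out[slot * n + k])]
        for slot in range(len(word))
    )


weights, errors = train()
```

After training, `read_aloud(weights, "cat")` can be compared with the target in `WORDS`, and `errors` traces how the summed error falls across epochs. No lexicon or rule set is specified anywhere in the sketch; whatever structure the network has is carried by the learned connection strengths.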
In early manifestations of the triangle model, just written and spoken forms of words were implemented in order to simulate reading fluency (Harm & Seidenberg, 1999; Seidenberg & McClelland, 1989; Zevin & Seidenberg, 2004). However, meaning representations have also been included in fuller implementations of the triangle model (e.g., Chang, Monaghan & Welbourne, 2019; Harm & Seidenberg, 2004; Plaut, McClelland, Seidenberg & Patterson, 1996) which can simulate reading comprehension (of single words). Including meaning representations also opens up the possibility of including the role of oral language skills in reading, by investigating the nature of mappings between spoken and meaning forms of words and their effect on learning from written word forms.
The second influential tradition in computational models of reading is more consistent with a symbolic modelling tradition (Foorman, 1994; Schneider & Graham, 1992), where the architecture and representations that convert written to spoken forms of words are explicitly defined. Key among these models is the dual route cascaded (DRC) model of reading (Coltheart, Rastle, Perry, Langdon & Ziegler, 2001), where written words are pronounced via two routes: one containing lexical representations where whole word written forms are mapped onto whole word spoken forms, and the other, operating simultaneously, via a set of grapheme-to-phoneme correspondence rules where letters and sets of letters are mapped onto phonemes. Cascading, in the title of the model, refers to the contribution of activations from both the lexical route and the grapheme-phoneme correspondence rule route to generate a pronunciation of the word. Unlike the triangle model tradition, mappings are not learned, but are pre-specified and hard-wired into the model. These hard-wired connections are then weighted according to observations about behaviour. Certain whole word forms are programmed to be activated more quickly than others, to reflect behaviour showing particular words being accessed more quickly than others. For instance, high-frequency words have higher activation than low-frequency words in the lexical route, and so high-frequency words are pronounced with more involvement of the lexical route than are low-frequency words, for which the grapheme-phoneme correspondence rules contribute a larger proportion of the activation. An adaptation of the DRC, the Connectionist Dual Process model (CDP+; Perry et al., 2007), has included a learning component, where the lexical route is pre-specified as in the DRC model, but the grapheme-to-phoneme correspondence rules can be learned in a connectionist system as a consequence of exposure to written and spoken forms of words.
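The contrast with a learned mapping can be made concrete with a small sketch of the dual route idea: a hand-specified lexicon whose activation depends on word frequency, alongside hand-specified grapheme-to-phoneme rules. This is an illustrative simplification under stated assumptions, not the DRC implementation: the lexicon entries, frequency values, phoneme codes, rule set, and the fixed `RULE_ACTIVATION` constant are all hypothetical, and the real model's cascaded activation dynamics are not simulated.

```python
# Hypothetical lexical route: whole written forms mapped to whole spoken forms,
# each with a hand-set activation that stands in for word frequency.
LEXICON = {
    "have": {"spoken": "hav", "activation": 0.9},   # treated as high frequency
    "pint": {"spoken": "pYnt", "activation": 0.2},  # treated as low frequency
}

# Hypothetical grapheme-to-phoneme correspondence rules (made-up phoneme codes).
GPC_RULES = {"sh": "S", "th": "T", "c": "k"}

# Assumed constant contribution of the rule route for any letter string.
RULE_ACTIVATION = 0.5


def gpc_route(letters):
    """Apply the rules left to right, preferring two-letter graphemes."""
    phonemes, i = [], 0
    while i < len(letters):
        for size in (2, 1):
            chunk = letters[i:i + size]
            if chunk in GPC_RULES:
                phonemes.append(GPC_RULES[chunk])
                i += len(chunk)
                break
        else:
            phonemes.append(letters[i])  # no rule listed: letter maps to itself
            i += 1
    return "".join(phonemes)


def lexical_route(letters):
    """Return (spoken form, activation); unknown strings get no lexical support."""
    entry = LEXICON.get(letters)
    if entry is None:
        return None, 0.0
    return entry["spoken"], entry["activation"]


def route_shares(letters):
    """Proportion of total activation contributed by each route."""
    _, lexical = lexical_route(letters)
    total = lexical + RULE_ACTIVATION
    return lexical / total, RULE_ACTIVATION / total


have_shares = route_shares("have")
pint_shares = route_shares("pint")
nonword_shares = route_shares("shap")
```

In this toy version the high-frequency entry draws a larger share of its activation from the lexical route than the low-frequency entry does, and a nonword such as "shap" is handled entirely by the rules, which is the qualitative pattern described above.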
The key distinctions between the triangle model of reading and symbolic modelling approaches, such as the DRC, are in terms of the constraints that the models apply to the representations involved in reading and the architecture of the reading system. The DRC, for example, involves a "lexicon", where orthographic and phonological representations of words are stored, and a finite set of rules that are defined in terms of how letters and sets of letters are translated into sounds. The DRC also explicitly defines two routes for reading: one involving the lexicon, and one involving the conversion rules from letters to sounds. In contrast, the triangle model of reading does not pre-specify the representations for converting written to spoken (and meaning) forms of words, nor does it pre-specify how these representations are used. Instead, the representations that are useful are acquired by the model as a consequence of the model's experience of the language environment. This is not to say that the triangle model does not develop word-level representations for certain words; the triangle model will use single letters, and groups of letters, up to the whole word, according to whether the statistics of those letter clusters are useful for forming mappings between written and spoken forms, and word-level representations are likely to be more useful for high-frequency words. Much of the substantive debate between these approaches is over the extent to which the models can reflect adult reading behaviour (e.g., Coltheart et al., 2001; Harm & Seidenberg, 2004), at the expense of the crucial question about how we can account for children's language and literacy development.
Both the triangle model and the dual route modelling traditions have been extremely successful in providing detailed descriptions of adult reading behaviours, and clarifying the cognitive processes underlying them (Adelman & Brown, 2008), as well as providing explanations for reading impairments such as phonological dyslexia (Harm & Seidenberg, 1999). Computational models of reading have also had success in simulating reading in different languages (Seidenberg, 2011), consequent on different language properties (Ziegler & Goswami, 2005) and different writing systems (Frost, 2012). For instance, Pagliuca and Monaghan (2010) demonstrated that a connectionist model of reading in Italian learned to map orthography to phonology very quickly, as it was able to exploit the systematic letter to sound mappings, and more quickly than similar models learning to read English. This reflected the fact that children learn reading fluency in Italian more quickly than children learning English (Seymour, Aro & Erskine, 2003). Similarly, the dual route architecture tradition of modelling has been shown to apply to European languages other than English, for instance by encoding a separate set of grapheme to phoneme correspondence rules for German (Ziegler, Perry & Coltheart, 2000).
Developmental studies of children in cultures that are not WEIRD are sparse (Nielsen, Haun, Kärtner & Legare, 2017), and this applies equally to computational models of reading (though see, e.g., Chang, Welbourne & Lee, 2016; Ueno & Lambon Ralph, 2013, for rare exceptions). However, in order to fully understand reading development, it is critical to examine the extant range of literacy systems globally: there are, for instance, approaching 2 billion readers of non-alphabetic writing systems (Smith, Monaghan & Huettig, 2021). Yang, Shu, McCandliss, and Zevin (2013) explored the cognitive consequences of reading in Chinese, where the Chinese logographic writing system has a large number of distinct characters, which correspond to morphemes (Zhou, 1978) but not transparently to sounds (Tong, McBride-Chang, Shu & Wong, 2009). Yang et al. (2013) found that the model of Chinese was able to learn reading fluency and word reading comprehension accurately, but that it acquired the mappings in a different way than a comparable English model, learning the written to meaning mappings more easily than the written to spoken mappings, whereas the opposite pattern was observed in English. In a controlled comparison among several writing systems using the triangle model framework, Smith et al. (2021) showed how the writing system can have a profound impact on the architecture of the reading system, with greater use of written to meaning forms for writing systems that are progressively more opaque (in terms of how systematic the relations between written and spoken forms are). The mechanisms involved in the adult reading system may thus vary substantially across different literacy cultures.
These implemented models of reading have thus proven successful in demonstrating how the computational requirements of mapping among representations of words (written, spoken, and meaning) affect, and are affected by, the cognitive components involved in reading and reading disorders. As a default, these computational models of reading aim to simulate adult reading behaviour (Coltheart et al., 2001; Harm & Seidenberg, 1999; Perry, Ziegler & Zorzi, 2007; Seidenberg & McClelland, 1989), that is, the outcome rather than the process of learning, and so they are not intended to capture reading development. Hence, translating from a model of mappings among representations, as in the connectionist triangle model shown in Figure 2, to a model of pathways among tasks, as in a behavioural model such as that in Figure 1, remains an ongoing problem. However, computational models of reading that have attended to language development enable this gap to be narrowed, and these models are described in the next section.

Computational models of reading development
Computational models that have aimed to capture aspects of reading development have tended to take one of two different approaches. The first approach investigates whether there are residual effects of the process of learning to read as exhibited in adult reading behaviour. The second approach recognises, and attempts to simulate, the effect of children's early language experience prior to learning to read, and then examines the model as it incrementally learns to read to determine the impact of this prior language experience on reading development.

Reading development crystallised in adult reading
The age of acquisition (AoA) of a word has an effect on adult reading behaviour: the earlier a word is acquired, the more quickly and accurately it is read (Brown & Watson, 1987; Brysbaert, van Wijnendaele & Deyne, 2000). This effect is independent of frequency, such that words with similar frequency (either cumulative frequency or current frequency) but that vary in age of acquisition are responded to differently (Morrison, Hirsh, Chappell & Ellis, 2002). Ellis and Lambon Ralph (2000) constructed a computational model that learned to map between abstract input and output patterns (mimicking written and spoken forms of words, such that there was a degree of systematicity between the input and output representations, reflecting the close but not perfect correspondence between letters and speech sounds in English). The model was a neural network and was trained with backpropagation, such that the connections between units were adjusted to reduce the model's error when producing spoken output from written input. They found that the mappings for patterns to which the model was exposed earlier in its training were learned more accurately than patterns that the model experienced later in its training. This was due to a higher degree of plasticity for the computational system early in its development: when the model's experience is limited it can adjust strengths of connections more effectively to learn earlier mappings than it can later, when subsequent learning depends on reorganising the connections to adapt to new mappings. Note that this plasticity was not manipulated in the model, but was rather a consequence of the way in which the model adjusted connection strengths between units. Early in training, the adjustments to connection strengths are larger because the model can move more quickly away from random activations of representations in the model, meaning that the model can adapt to stimuli more quickly; later in training, the adjustments to connections are smaller because it is harder to move the model away from activations that are more structured, meaning that adapting to new stimuli is slower (for a fuller explanation of plasticity in neural network models, see Ellis and Lambon Ralph, 2000, p. 1108).
Monaghan and Ellis (2010) tested early plasticity as a potential explanation for AoA effects observed in adults in a computational model of reading in English. They trained the written to spoken part of the triangle model of reading by presenting written words incrementally to the model according to an order that approximated a child learning to read. After learning to read all the words in the language, the AoA effect was apparent in reading fluency: words experienced earlier in training were read more accurately than those experienced later, even when frequency of the word had been taken into account (Zevin & Seidenberg, 2004). Chang et al. (2019) expanded the Monaghan and Ellis (2010) model to the full triangle model of reading. They trained the model to map from written to spoken and meaning representations with words presented incrementally according to age-appropriate reading material from age 5 years and upwards. After the model had learned to read all words in the language, the model demonstrated AoA effects for both reading fluency and reading comprehension tasks, but a substantially larger effect of AoA for reading comprehension, where the arbitrary mappings between written and meaning representations were implicated. This reflected human behaviour, where AoA effects are larger when word meanings are involved (Brysbaert et al., 2000). These models demonstrate that the individual's personal history of learning to read affects their processing as a mature reader, and that the reader's learning trajectory exerts stronger or weaker effects depending on the task (Taylor, Duff, Woollams, Monaghan & Ricketts, 2015).
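The loss of plasticity described above, with large adjustments early and small adjustments once activations have become structured, follows from the form of the weight update for a logistic unit. The sketch below is a minimal illustration under stated assumptions: a single sigmoid unit, a summed squared error measure, and hypothetical net input values chosen to stand for an uncommitted unit and an entrenched one. It does not reproduce any of the published simulations.

```python
import math


def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))


def weight_update(net, target, lr=0.5, x=1.0):
    """Backpropagation update for one connection into a sigmoid output unit.

    The update is lr * error * slope * input, where the slope o * (1 - o)
    is largest when the unit is uncommitted (o near 0.5) and shrinks as the
    unit's activation moves towards 0 or 1.
    """
    o = sigmoid(net)
    return lr * (target - o) * o * (1.0 - o) * x


# Early in training: weights are near zero, so the unit sits at o = 0.5.
early = weight_update(net=0.0, target=1.0)

# Later in training: the unit has been driven strongly towards 0 by earlier
# items (hypothetical net input of -4), and a new item now requires 1.
late = weight_update(net=-4.0, target=1.0)

ratio = early / late
```

Although the error for the late item is larger than for the early one, the update is several times smaller, because the slope term has collapsed. This is one simple way of seeing why patterns introduced late are harder to accommodate without any explicit manipulation of plasticity.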

Early oral language development
The second approach to computational modelling of literacy is to take into account children's oral language experience and its effect on learning to read. This approach draws the computational approach a step closer to behavioural models of reading, such as that illustrated in Figure 1. Harm and Seidenberg's (1999) model of reading was initially exposed to spoken representations of English words prior to receiving any orthographic representations. This was to simulate readers having substantial experience of hearing language before learning to read. The model was able to develop phonological "attractors", where knowledge of the structure of spoken words in English was manifested before the model began to learn to read. However, the model did not contain any meaning representations of words, and so prior experience of words' meanings was not incorporated into the model. Furthermore, gradual development was not included in Harm and Seidenberg's (1999) model: it experienced orally all the words that it would later learn to read, and hence there was no incrementality in the model's design. Chang and Monaghan (2019) included oral language experience as a key principle in determining how preschool language experience influenced later literacy learning in their computational model of reading. In Harm and Seidenberg's (1999) model, oral language experience was implemented in terms of the model developing stable sound representations for words; thus, the model acquired sound to sound mappings. Furthermore, these sound to sound mappings were experienced for all words in the vocabulary from the outset. This is an oversimplification of children's oral language experience in two key ways. First, in early language experience, children listen to words in order to comprehend speech, rather than just learn the sounds of the language, and they also develop skills of speaking generated from an intention to convey a message. Harm and Seidenberg's (1999) model, because it lacked meaning representations, was unable to simulate comprehension or production of spoken words. Second, as noted in the models of AoA described previously, children are not exposed to all their language at once, but rather the experience of vocabulary gradually expands as children develop.
As described above, according to the Simple View of Reading (Gough & Tunmer, 1986), children's early oral language comprehension ability, coupled with decoding ability, is a key contributor to successful reading (Dickinson et al., 2010; Duff et al., 2015; Hjetland et al., 2020; Lee, 2011). However, how to operationalise oral language experience in a computational model is a challenging issue. Children's experience of oral language is enormously variable in quantity (Golinkoff, Hoff, Rowe, Tamis-LeMonda & Hirsh-Pasek, 2019; Hart & Risley, 1995), meaning that some children have substantial training on mappings between words' sounds and meanings, whereas other children have much less opportunity to learn these representations (Anderson, Graham, Prime, Jenkins & Madigan, 2021). In addition to quantity of oral language experience, quality of the experience matters too for children's early language development. Rowe (2012) showed that numerous properties of child-directed speech influenced children's language development, particularly when more complex constructions, involving longer utterances and broader vocabulary, were used. Determining quality is harder to pin down than quantity in terms of oral language input, and Rowe and Snow (2020) reviewed a range of characteristics of children's language experience relating to quality. Among these were linguistic measures (Lieven, 2019) that described the clarity of the input in terms of phonology and syntactic structure, grammatical and discourse complexity, as well as lexical diversity and the sophistication of the vocabulary in terms of range and depth of meanings. In particular, children's exposure to vocabulary range becomes more important with age in promoting children's own vocabulary size (Hsu, Hadley & Rispoli, 2017; Huttenlocher, Waterfall, Vasilyeva, Vevea & Hedges, 2010; Newman, Rowe & Bernstein Ratner, 2016; Pan, Rowe, Singer & Snow, 2005; Weizman & Snow, 2001).
Chang and Monaghan (2019) constructed a computational model of reading in order to examine the role of both quantity and lexical diversity in early language experience and its effect on literacy, with the added benefit in computational modelling of being able to tightly control and manipulate the relative contributions of quantity and quality. They exposed the triangle model to varying quantities of early oral language experience, in which the model learned to map between spoken and meaning representations of English words. They also varied the quality of oral language experience, in terms of the range of vocabulary to which the preschool model was exposed. After this preliteracy oral language training, Chang and Monaghan (2019) then presented written representations of words to the model and required the model to learn to pronounce the words (reading fluency task) and identify the meaning of the words (word reading comprehension task). They found that both quantity and quality of oral language experience exerted an effect: the more words heard, and the wider the range of the vocabulary from which those words were sampled, the more quickly and accurately the model learned to read, especially for the word reading comprehension task. This was because the model could use the pretrained spoken to meaning, and meaning to spoken, mappings to assist in increasing the fidelity of the spoken and meaning representations of words (Perfetti & Stafura, 2014), and to support division of labour among the pathways in the model: a model that had a good ability to produce the meaning of a spoken word can then map the written form onto its meaning via the easier to acquire written to spoken pathway in the model. Similarly, but to a lesser degree, the meaning to spoken pathway could also be incorporated into producing the spoken form of written words with support from the harder to learn, but still contributing, written to meaning pathway.
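The two manipulations described above can be separated cleanly in a simulation, because quantity and lexical diversity are independent parameters of the preliteracy input. The sketch below shows one hypothetical way to generate that input; the word list, the condition names, the exposure counts, and the uniform sampling scheme are assumptions for illustration and are not taken from the published model.

```python
import random


def sample_oral_experience(vocabulary, n_exposures, vocab_range, seed=0):
    """Return a list of spoken-word tokens for preliteracy training.

    n_exposures: quantity, the number of word tokens heard.
    vocab_range: quality, operationalised here as lexical diversity, the
                 number of distinct words the tokens are drawn from.
    """
    if not 0 < vocab_range <= len(vocabulary):
        raise ValueError("vocab_range must be between 1 and len(vocabulary)")
    rng = random.Random(seed)
    available = vocabulary[:vocab_range]  # restricted or full vocabulary
    return [rng.choice(available) for _ in range(n_exposures)]


# Hypothetical vocabulary and a 2 x 2 design crossing quantity with range.
VOCABULARY = ["dog", "cat", "ball", "milk", "shoe", "book", "tree", "bird"]
CONDITIONS = {
    "low quantity, narrow range": {"n_exposures": 20, "vocab_range": 2},
    "low quantity, wide range": {"n_exposures": 20, "vocab_range": 8},
    "high quantity, narrow range": {"n_exposures": 200, "vocab_range": 2},
    "high quantity, wide range": {"n_exposures": 200, "vocab_range": 8},
}

experience = {
    name: sample_oral_experience(VOCABULARY, **settings)
    for name, settings in CONDITIONS.items()
}
```

Each list in `experience` would then serve as the training stream for the spoken to meaning and meaning to spoken mappings before any written forms are introduced, so that later differences in reading can be attributed to the condition.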
Curiously, however, the model also showed that lots of experience of a limited vocabulary was detrimental to later learning to read. The model performed worst of all in learning to read when it had a lot of oral language experience of a small vocabulary. This was due to plasticity in the model. If the model had learned to limit its understanding to a smaller vocabulary, and its representation of this small vocabulary was then crystallised due to lengthy exposure, then it was harder for the model to reconfigure to read words that were not part of its preliteracy oral language experience. This lack of plasticity was only the case when the model had had a lot of experience of the small vocabulary. A little experience of a small vocabulary rendered a system that still retained plasticity to expand to a broader vocabulary when the opportunity for that enriched reading environment arose.
The computational modelling work therefore showed how quantity and quality of oral language experience affect learning to read. Behavioural studies have established that both quantity and quality contribute positively to learning to read (Rowe & Snow, 2020). The computational modelling shows why there is this benefit: quantity of oral language exposure (Golinkoff et al., 2019; Hart & Risley, 1995) enables the young reader to develop good representations of the sound of words, build up a high-fidelity representation of the words' meaning, and effectively map those sounds onto meaning for those words that occur in their experience. Quality of oral language exposure, in terms of lexical diversity (Hsu et al., 2017; Huttenlocher et al., 2010; Newman et al., 2016; Pan et al., 2005; Weizman & Snow, 2001), affects the plasticity, and not the fidelity, of the language system that adapts to learning to read. In this case, quality of input generates a system that can represent a larger range of combinations of word sounds and a more sophisticated set of meanings, which can then be extended more rapidly as a larger vocabulary is acquired. These modelling results highlight the growing importance of quality as children age: Newman et al. (2016), for instance, found that though quantity was most important for children's very early vocabulary development, quality became more important as children moved into kindergarten age.
Chang et al. (2019) extended the simulations from Chang and Monaghan's (2019) model to examine how this preschool oral language experience affected learning to read when reading was simulated with incremental growth of the vocabulary, according to children's gradual expansion of reading material from age 5 upwards. The model showed that division of labour between spoken and meaning representations was critical for solving the cognitive task of learning to map written words onto their spoken and meaning forms. They found greater division of labour between spoken and meaning representations earlier in reading training than later (see also Yang et al., 2013), but also critical was whether words were learned before or after the onset of literacy. Chang et al. (2019) found that words the model did not know prior to learning to read were acquired more slowly, and were read with greater involvement of the written to meaning pathway in the model (so meaning was more involved even in reading fluency tasks). Words which the model knew orally prior to learning to read were read more quickly and tended to have greater involvement of the written to spoken pathway, even for word reading comprehension tasks. Chang et al. (2019) tested this prediction of the model in a reanalysis of adult reading behavioural data, which demonstrated that words are not all read in the same way, and that the reading architecture itself is not a monolithic system but rather adaptively and flexibly applies to different requirements of different words experienced at different life stages. The modelling work therefore predicted that how a person learns a word, whether from oral experience or from text, has implications for how that word is represented in the language system, a prediction that was observed in adult behavioural reading data, where spoken word forms are less involved during reading for words learned from text.
These triangle models of oral language experience affecting reading development (Chang & Monaghan, 2019; Chang et al., 2019) have further implications for the consequences of reading training for children with different pre-literacy experience. The models predict that children with larger vocabularies will read words more efficiently, decoding from written to spoken forms, and accessing meaning from the spoken form. Children with smaller vocabularies, who acquire more of their vocabulary from reading, will acquire words more effortfully, requiring the arbitrary mapping between written and meaning representations to be learned in order to support effective comprehension of these words. Consistent with Chang et al.'s (2019) computational modelling predictions, Siegelman, Rueckl, Steacy, Frost, van den Bunt, Zevin, Seidenberg, Pugh, Compton, and Morris (2020) showed that children with better reading fluency were more influenced by regularities between written and spoken forms, indicating that they tended to rely more on the written to spoken than on the written to meaning pathway.
Differences in preschool oral language skills may also play an important role in determining how effective different literacy training schemes are. If children have good oral language skills, then they will be able to learn to read words (in an alphabetic writing system) for pronunciation and for meaning via the more systematic and easier to acquire written to spoken mapping, with meaning activated partially via the spoken to meaning mappings that were trained pre-literacy. However, children who have poorer oral language skills will be required to acquire the direct written to meaning mappings for word reading comprehension, which are arbitrary and harder to learn. Hence, focusing on the pronunciation of written words when learning to read will be most beneficial when the child already has good oral language skills; otherwise, a focus on pronunciation will end in a cul-de-sac in terms of generating meaning: the meaning will not be effectively accessed via the spoken form of the word.
Chang, Taylor, Rastle, and Monaghan (2020) applied the triangle model of reading to investigate how alternative methods for literacy training might influence reading development differently according to children's oral language skills. The triangle model of reading in English was trained under two regimes, one that simulated meaning-focused training and one that simulated sound-focused training. These were distinguished in terms of which of the pathways in the model (written to meaning, or written to spoken) was trained most often. The model with good preliteracy oral language skills responded well to both forms of training, whereas the model with poor oral language skills performed more poorly on word comprehension after sound-focused compared to meaning-focused training. The reason for this difference in response is the extent to which the model was able to utilise its knowledge of the oral language. When the model had good oral language skills prior to reading training, its ability to understand spoken words, as well as to pronounce words with a variety of meanings, was effective. Then, regardless of whether the training was meaning-focused or sound-focused, the model was able to generate spoken and meaning forms of those words via the pre-trained pathways between meaning and sound in the model. In contrast, when the oral language skills in the model were poor, the pre-learned pathways between meaning and sound were less effective, and so sound-focused training meant that the sounds of words could be produced but the meaning could not be generated effectively. Under meaning-focused training, the easier to learn written to sound mappings could still be acquired by the model, and so there was a smaller impediment than there was under sound-focused training.
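The contrast between the two training regimes can be pictured with a toy sketch. The snippet below is only an illustration of the idea that the regimes differ in how often each reading pathway is trained; it is not the implementation used by Chang et al. (2020). The 0.8/0.2 weights, the function names, and the pathway labels are all assumptions introduced here for illustration.

```python
import random
from collections import Counter

# The two pathways that receive reading training in a triangle-style model.
PATHWAYS = ("written_to_spoken", "written_to_meaning")

# Hypothetical regime weights (assumed for illustration only; these are
# NOT the proportions reported by Chang et al., 2020).
REGIMES = {
    "sound_focused": {"written_to_spoken": 0.8, "written_to_meaning": 0.2},
    "meaning_focused": {"written_to_spoken": 0.2, "written_to_meaning": 0.8},
}


def training_schedule(regime, n_trials, seed=0):
    """Return the pathway trained on each trial under the given regime."""
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    weights = [REGIMES[regime][pathway] for pathway in PATHWAYS]
    return rng.choices(PATHWAYS, weights=weights, k=n_trials)


def pathway_counts(regime, n_trials, seed=0):
    """Count how many trials each pathway receives under the given regime."""
    return Counter(training_schedule(regime, n_trials, seed))


if __name__ == "__main__":
    for regime in REGIMES:
        print(regime, dict(pathway_counts(regime, 1000)))
```

Under these assumed weights, a sound-focused schedule spends most trials on the written to spoken pathway and a meaning-focused schedule spends most on the written to meaning pathway; in both cases the less-trained pathway still receives some practice.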
The modelling results were consistent with behavioural studies of adults learning to read in a new orthographic system (Taylor, Davis & Rastle, 2017): when adults were trained in learning to read with sound-focused training, there was a detrimental effect on learning the meaning of the written words, but no such impairment when learning to read with meaning-focused training. The adults in this study, however, already had good oral language skills, and so the key insight of the modelling results is how different systems of reading training are likely to affect children varying in their oral language skills. The implication of these simulations is that combining oral language skill development with reading training interventions is likely to best support the literacy development of children, particularly those with poorer oral language skills. Such results are consistent with behavioural studies that show oral language skills and decoding skills to be separable predictors of children's literacy skills (Nation & Snowling, 2004; Ouellette & Beers, 2010; Ricketts et al., 2007), and that poor oral language skills affect reading comprehension more than reading fluency (Bishop, McDonald, Bird & Hayiou-Thomas, 2009). The model results thus support calls to focus during early literacy on improving oral language skills alongside building up skills in translating letters into sounds (Castles et al., 2018; Ricketts et al., 2007), and requests not to neglect the importance of direct written to meaning mappings in understanding reading development (Taylor et al., 2015).

Future directions and challenges
So, how close is the alignment between computational models (e.g., Figure 2) and behavioural studies (e.g., Figure 1) of reading development? There are several ways in which the divide is being narrowed by these developmental computational models of reading. One benefit of the developmental computational models is that they incorporate oral language skills as well as reading training, and so explicitly test early language skills and their relation to reading development. Thus, individual differences in the model's oral language skills enable the computational models to generate hypotheses about why oral language skills predict reading fluency and comprehension, and why links from oral language to comprehension are stronger than those to fluency (see, e.g., Figure 1). The models also provide explanations for how the role of oral language skills may change over time, and why certain reading schemes may be more or less successful for readers with different oral language profiles. However, substantial gaps still remain. There are limitations to extant computational models of reading in terms of how closely they resemble human reading behaviour. With very few exceptions (Ans, Carbonnel & Valdois, 1998; Pagliuca & Monaghan, 2010; Perry, Ziegler & Zorzi, 2010; Rastle & Coltheart, 2000), models have tended to simulate reading only of monosyllables. In English, monosyllables represent 70% of word tokens that children read, but polysyllabic words present crucial (and interesting) challenges both for children learning to segment words into syllables during reading (Duncan & Seymour, 2003; Kearns, 2020; see Mousikou, Sadat, Lucas & Rastle, 2017, for a study with adults), and for computational models that have attempted to simulate those behaviours, which usually require substantial additional machinery compared to models of monosyllables (e.g., Ans et al., 1998; Perry et al., 2010; Plaut, 1999; Rastle & Coltheart, 2000).
Another lack of resemblance is that models of word reading tend to be just that: models of individual, isolated word reading. Reading comprehension, in contrast, tends to be tested with questions about narrative and discourse, rather than merely identifying the meaning of a single word (which often cannot be isolated in its meaning from the rest of the text; Ouellette, 2006). Computational models of reading development at the word level therefore need to become more closely aligned with models of text reading (Reichle, 2021), such that insights at the word level can permeate models that simulate readers' responses to longer texts.
Relatedly, attention to the role of grammatical development in oral language skills, and its involvement in literacy development, needs to be extended. Sensitivity to some grammatical features of words (such as derivational and inflectional morphology) has been simulated computationally using similar principles to those applied in models of word reading. Joanisse and Seidenberg (2003), for instance, constructed a connectionist model of sequences of spoken words mapping onto their meanings. They found that an impairment to phonological processing resulted in reduced sensitivity to inflectional morphology, limiting the model's ability to utilise grammatical cues in the language to determine meaning. Extending this to a triangle model of reading, Monaghan and Woollams (2017) found that either a phonological or a comprehension deficit impaired sensitivity to inflectional morphology from written words. As language develops, use of grammatical cues to support and constrain learning becomes more important (Gleitman, Cassidy, Nappa, Papafragou & Trueswell, 2005). Boundaries between grammatical and vocabulary knowledge seem to be indistinct, at least during children's early language development (Brinchmann, Braeken & Lyster, 2019), and the two often tend not to be distinguished in behavioural models of pre-literacy language skills influencing reading development (e.g., Hjetland et al., 2020). Implementing how grammatical knowledge impacts literacy development, in addition to vocabulary skills, is another important topic in better mapping computational models of literacy to behavioural studies of children's early language development.
Greater specificity in the spoken and meaning representations used in models of reading (and inclusion of more facts about visual processing of stimuli) can also help to clarify some of the other early pre-literacy language skills that relate to reading skill development. For instance, phonological awareness tasks have been related to fidelity of spoken representations in the triangle model (Harm & Seidenberg, 1999; Smith et al., 2021), and phonics training has been shown to increase the precision of individual phoneme representations further (Harm, McCandliss & Seidenberg, 2003). However, computational models represent sequences of phonemes either categorically (one unit for each phoneme) or in terms of sets of phoneme features, which is an abstraction away from the auditory properties of words. Implementing auditory features may better highlight how phonological processing skills exert an influence on decoding skills, and consequent reading development. Similarly, meaning is represented in these models either categorically, or in terms of semantic features: these are derived from encyclopedic definitions of words or vectors derived from contextual co-occurrences in text (Mandera, Keuleers & Brysbaert, 2017), and links among meanings are seldom included (apart from abstract contextual units (Chang et al., 2019), or weighted links based on free association norms (Li, Farkas & MacWhinney, 2004; Steyvers & Tenenbaum, 2005)). There are also relevant models that more closely describe and simulate how vocabulary develops, which is not due only to frequency in the environment, but also how the new concept links to the knowledge network of previously acquired words (Jiménez & Hills, 2022). These alternative representations of meaning may help further unpack the role of grammatical skills and vocabulary knowledge that are generally elided into a composite measure of oral language in behavioural models of reading (e.g., Duff et al., 2015; Foorman et al., 2015; Hjetland et al., 2020).
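The difference between categorical and feature-based phoneme coding can be made concrete with a minimal sketch. The four-phoneme inventory and the four binary features (voiced, nasal, labial, continuant) below are a deliberately simplified set assumed here for illustration; they are not the feature scheme of any particular published model.

```python
# Tiny phoneme inventory, assumed for illustration.
PHONEMES = ["p", "b", "m", "s"]


def one_hot(phoneme):
    """Categorical coding: one unit per phoneme."""
    return [1 if p == phoneme else 0 for p in PHONEMES]


# Simplified binary features (assumed for illustration):
# (voiced, nasal, labial, continuant)
FEATURES = {
    "p": [0, 0, 1, 0],  # voiceless bilabial stop
    "b": [1, 0, 1, 0],  # voiced bilabial stop
    "m": [1, 1, 1, 0],  # voiced bilabial nasal
    "s": [0, 0, 0, 1],  # voiceless alveolar fricative
}


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    for other in ["b", "m", "s"]:
        categorical = hamming(one_hot("p"), one_hot(other))
        featural = hamming(FEATURES["p"], FEATURES[other])
        print(f"p vs {other}: categorical={categorical}, featural={featural}")
```

With categorical coding every pair of distinct phonemes is equally dissimilar, whereas the feature coding treats /p/ and /b/ as closer to each other than /m/ and /s/. Both codings remain abstractions away from the auditory signal, which is the limitation noted in the text.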
Other opportunities for future computational models of reading to enable closer alignment with behavioural studies can expand on the groundwork that has been laid in terms of cross-linguistic and cross-orthography analyses of reading. Implementing different writing systems, and determining how variety in phonological systems across the world's languages affects learning to read, will give us new insights in constraining the cognitive processes that we have inferred as involved in reading (from studies of English and the handful of other languages that have been investigated). These models will also enable us to determine how transfer effects from one language to another (and across writing systems) may affect reading development in children who move across cultures (Melby-Lervåg & Lervåg, 2014). Such models would also require determining how language skills (both oral and orthographic) transfer from one language to another, and these may have unexpected effects in terms of where advantages of transfer can occur (see, e.g., Monaghan, Chang, Welbourne & Brysbaert, 2017, for transfer from English to Dutch in a model of reading, and Paulesu, Bonandrini, Zapparoli, Rupani, Mapelli, Tassini, Schenone, Bottini, Perry & Zorzi, 2021, for insights into how bilingualism may alter reading behaviour compared to monolingualism).

Conclusion
There is now a long-standing tradition of computational models of reading that have provided numerous insights into the processes involved in adult reading. Recent extensions of these computational models to investigate reading development have opened up possibilities for closer alignment of propositions about cognitive mechanisms involved in reading with behavioural observations of early language skills affecting children's reading development. There are points where divergence remains between computational and behavioural studies, but the consideration of a range of oral language skills and their relation to reading development in the computational models results in widening opportunities for co-creation of new insights into reading acquisition (see, e.g., Ziegler, Perry & Zorzi, 2020). Among these opportunities are: examining the causes of individual differences in learning to read; understanding the key pressure points in learning to read (such as transitions from monosyllabic to polysyllabic reading); predicting the impacts of different interventions on children's learning; and discovering how learning may vary across different languages and writing systems. The gap between behavioural and computational studies can then be narrowed through future behavioural work that specifies more closely the cognitive mechanisms involved in children's language task performance (e.g., Siegelman, Rueckl, van den Bunt, Frijters, Zevin, Lovett, Seidenberg, Pugh & Morris, 2022), together with increasingly detailed computational modelling that gets closer to children's language experience, and the constraints on their perception and production of language.