Infancy and early childhood maturation of neural auditory change detection and its associations to familial dyslexia risk

OBJECTIVE
We investigated early maturation of the infant mismatch response (MMR), including mismatch negativity (MMN), positive MMR (P-MMR), and late discriminative negativity (LDN), which index auditory discrimination abilities, and the influence of familial developmental dyslexia risk on these responses.


METHODS
We recorded MMRs to vowel, duration, and frequency deviants in pseudo-words at 0, 6, and 28 months and compared MMRs in subgroups with vs. without dyslexia risk, in a sample in which at-risk infants were over-represented.


RESULTS
Neonatal MMN to the duration deviant became larger and earlier by 28 months; MMN was elicited by more deviants only at 28 months. The P-MMR was predominant in infancy; its amplitude increased by 6 and decreased by 28 months; latency decreased with increasing age. An LDN emerged by 6 months and became larger and later by 28 months. Dyslexia risk affected MMRs and their maturation.


CONCLUSIONS
MMRs demonstrate an expected maturational pattern with 2-3 peaks by 28 months. The effects of dyslexia risk are prominent but not always as expected.


SIGNIFICANCE
This large-scale longitudinal study shows MMR maturation with three age groups and three deviants. Results illuminate MMR's relation to the adult responses, and hence their cognitive underpinnings, and help in identifying typical/atypical auditory development in early childhood.


Introduction
The mismatch negativity (MMN) of the auditory event-related potentials (ERPs) provides an attractive means to examine the emerging auditory cognitive functions from birth onwards (e.g. Alho et al., 1990; or even in foetuses, Huotilainen et al., 2005, and infants born preterm, Cheour-Luhtanen et al., 1996; for reviews, see Cheour et al., 2000; Cheour, 2007; [...] et al., 2013). After 30 years of research in the field, the maturation of the MMN during the first years of life, when it is often called the mismatch response (MMR), has still rarely been studied in longitudinal settings. The first aim of the present work is to determine, using a large longitudinal sample, the maturational trajectory of the MMN together with other change-detection-related ERP responses. For the reader's convenience, as the relationship of the infant and adult ERP components is still not well established, the whole change-related response complex is here referred to as MMR. Better understanding of the maturation of the MMR is urgently needed in order to grasp how it develops towards the components found in adults and what cognitive functions it reflects (for a suggestion, see Kushnerenko et al., 2013). Knowing the typical maturation of the responses and their neurocognitive underpinnings will also promote their use as neural predictive markers for future development and disorders.
Abbreviations: MMN, mismatch negativity; MMR, mismatch response; P-MMR, positive mismatch response; ERP, event-related potential; DP, difference positivity; LN, late negativity; Nc, negative component; RON, reorienting negativity; LDN, late discriminative negativity; EEG, electroencephalogram; ICA, independent component analysis; LMMs, linear mixed models; (RM)-ANOVA, (repeated-measures) analysis of variance.
The auditory ERPs are sensitive to auditory and speech-processing deficits in the heritable neurodevelopmental reading deficit developmental dyslexia (Ozernov-Palchik and Gaab, 2016) and comorbid conditions, such as developmental language disorder (Kujala and Leminen, 2017). Dyslexia is a prevalent disorder, understood to stem from a largely auditory-system-based deficit in acquiring and adequately processing the native language phonemes (the phonological deficit; Eden et al., 2016; Giraud and Ramus, 2013; Peterson and Pennington, 2015; Vellutino et al., 2004). Accordingly, the promise of ERPs as indices of auditory processing deficits in dyslexia and related conditions, and as predictors of future language development and its delays, has received interest (e.g. Choudhury and Benasich, 2011). The second aim of the present work is to compare MMRs and their maturation in subgroups with or without a familial risk for dyslexia, in order to reveal neural deficits that may be predictive of future language and reading problems.

MMRs and their maturation in the first years of life
In adults, the MMN is a fronto-central negativity of auditory cortical origin at around 150-250 ms after deviance onset in response to an irregular or unpredicted event in the auditory stream (e.g., Näätänen et al., 2007; Winkler, 2007). Instead of or in addition to the MMN-like negativity, many newborn studies have reported a change-related positivity, often termed MMR or positive-MMR (P-MMR) and interpreted as an "immature MMN" (e.g. He et al., 2007, 2009; Trainor et al., 2003; Trainor, 2012). As the relationship of these two responses is not known, longitudinal studies are needed to understand how they relate to each other and to the adult MMN.
However, only a few studies on MMN/(P-)MMR maturation in the first years of life have been published so far (Čeponienė et al., 2000; Cheng et al., 2013, 2015; Choudhury and Benasich, 2011; Fellman et al., 2004; Kushnerenko et al., 2002; Pihko et al., 1999). Only three of them collected data from more than two follow-up points (ranging from birth to 48 months), and they all presented a tone frequency change in an oddball paradigm (Kushnerenko et al., 2002; Fellman et al., 2004; Choudhury and Benasich, 2011). These studies report an early and a late negativity, both increasing in amplitude with age (MMN and late negativity LN/negative component Nc), and a positivity (P-MMR/difference positivity DP/P3a) between them that decreases after 6 months.
An extensive review suggests that the aforementioned three components would mature into MMN, P3a, and reorienting negativity (RON), respectively (Kushnerenko et al., 2013; see also Gomes et al., 2000; Kushnerenko et al., 2007; Wetzel and Schröger, 2014). The P3a in adults is a positivity at around 300 ms from deviance onset, thought to reflect an involuntary attention switch towards a salient stimulus (e.g. Horváth et al., 2008). Following it, the RON at around 400-600 ms from deviance onset returns the focus of attention back to the task at hand (Horváth et al., 2008). A component described in children (e.g. in 2-3-year-olds, Putkinen et al., 2012) resembling the infant LN/Nc and adult RON is the late discriminative negativity (LDN, Korpilahti et al., 1995). LDN has been suggested to be particularly associated with speech or higher-order processing of stimuli (e.g. Bishop et al., 2011; Kuuluvainen et al., 2016). Therefore, its relationship to adult RON is not straightforward (see e.g. Shestakova et al., 2003). MMN and LDN have been reported as separate components already in newborn infants (e.g., Kushnerenko et al., 2001; Martynova et al., 2003: MMN latency of ~150-200 ms and LDN latency of ~350-400 ms to deviants in speech stimuli), although originally LDN was called the late MMN (Korpilahti et al., 1995; for a review, see Cheour et al., 2001). In the present study, the consecutive negative, positive, and negative peaks of the MMR will be referred to as MMN, P-MMR, and LDN, respectively.

MMRs and deficient auditory discrimination skills in dyslexia
Many background factors could potentially affect the MMRs and their maturation, a family history of dyslexia being one of them (Volkmer and Schulte-Körne, 2018). In dyslexia risk infants, both positive and negative MMRs were often found to be diminished or absent to speech sound duration (Leppänen et al., 2002), consonant changes (van Leeuwen et al., 2006, 2008), and (complex) tone frequency changes (in infants at risk for dyslexia or developmental language disorder, Benasich et al., 2006; Choudhury and Benasich, 2011). Atypical hemispheric lateralization of the MMRs in dyslexia risk is also typically reported, e.g., group differences only in the left hemisphere (e.g. Benasich et al., 2006; Choudhury and Benasich, 2011; Leppänen et al., 2002), atypically right-lateralized phoneme-MMRs in dyslexia risk infants (van Leeuwen et al., 2008), or less left-lateralized phoneme-MMRs in school-aged children with dyslexia (Maurer et al., 2003). LDN amplitudes in dyslexic children have been diminished at school age (Halliday et al., 2014; Neuhoff et al., 2012) but enlarged in at-risk children at kindergarten age (Hämäläinen et al., 2015). Enlarged LDNs at kindergarten age could indicate immature processing, as LDN should diminish with maturation in this age group (e.g., Linnavalli et al., 2018).
The auditory stimuli and exact neural measures (e.g. which response) that have been investigated vary markedly between studies. Still, particularly speech-elicited MMRs in young children seem to be quite coherently associated with both dyslexia risk and subsequent reading skills, in line with the phonological deficit theory of dyslexia (Volkmer and Schulte-Körne, 2018). The present study provides a comprehensive report of the associations of MMRs (MMN, P-MMR, and LDN response amplitudes and latencies and hemispheric lateralization) and their early maturation with dyslexia risk. This can further clarify the picture of whether and how the early MMRs can reflect the auditory and speech processing deficits associated with or contributing to the development of dyslexia.

Research questions and hypotheses
The present study investigated the morphology and early maturation of the MMR complex to three auditory deviants (vowel duration, syllable frequency, vowel identity) in a pseudo-word from birth to 6 months and up to 28 months in a longitudinal setting. We studied the elicitation of MMN, P-MMR, and LDN in each measurement point, as well as maturational changes in their amplitude, latency, and hemispheric lateralization. In two subsamples, we further analyzed the differences in the MMR responses and their maturation between infants with or without a familial risk for dyslexia.
Based on previous reports, it was hypothesized that a negative-positive-negative pattern (MMN-P-MMR-LDN) would be observable in the deviant-standard subtraction waveform and be statistically significant at least by 28 months. We expected the MMN and LDN amplitudes to become larger with age, and the P-MMR to decrease in amplitude and latency after 6 months. MMN may also decrease in latency. Based on previous results, maturational trajectories of the morphology of the infant MMR differ depending on the stimulus, so that larger and more salient deviants are more likely to elicit negative responses from early on, whereas smaller and less salient changes elicit positive responses, if any (Cheng et al., 2013, 2015; Cheng and Lee, 2018; Kushnerenko et al., 2007; Peter et al., 2016; see also He et al., 2007, Morr et al., 2002, Leppänen et al., 1997). It was therefore expected that some deviants elicit certain responses, e.g., MMN, already at birth, while some others only elicit them later in development. For example, the vowel identity deviant used in the present study may be the least salient of the deviants and therefore not elicit MMNs at birth (see Thiede et al., 2019). Based on the above-reviewed literature, we hypothesized that the MMRs are diminished in amplitude, delayed in latency, or absent in the infants at dyslexia risk, and their maturation might be slower than in the control group. The MMRs were also expected to be atypically right- or bi-lateralized in the dyslexia risk group, or show group differences only in the left hemisphere electrodes.

Participants
A total of 210 infant participants were recruited during pregnancy or around the time of birth via traditional media appearances and social media advertisements, local maternity clinics and wards, and via the website of the DyslexiaBaby study. The participant selection process, exclusions, and final sample sizes of the whole sample in different parts of the follow-up are described in Fig. 1 and background information of the final samples in Table 1. The recruitment during pregnancy was targeted mainly at parents with dyslexia (target-N = 150), but infants of non-dyslexic parents (target-N = 50) were also recruited with the same strategies. In order to be enrolled in the longitudinal study, the infants had to be born healthy, at term (gestational age at least 37 weeks and birth weight at least 2500 g), and with normal hearing. Evoked Oto-Acoustic Emissions (EOAE) tests were conducted on newborns routinely at the hospital; in two infants of the present sample, this information was missing but hearing was later screened in a maternity clinic and was normal. In addition, Finnish had to be (one of) their native language(s).
Infants included in the dyslexia risk group of the DyslexiaBaby study had one or two biological parents with dyslexia. Parental dyslexia was confirmed either by a recent (within the last five years) diagnostic statement from a health care professional, or by reading test performance in the present study, together with self-reported reading- and writing-related difficulties in childhood. Reading tests were conducted by psychology master students supervised by a licensed psychologist, and consisted of a Finnish standardized test measuring speed and accuracy of oral text, word, and pseudo-word reading, and writing speed (Nevala et al., 2006). Criteria for dyslexia were below-norm performance of at least one standard deviation (SD) in reading or writing speed or accuracy in at least two out of four subtests. However, some parents did not entirely meet the criteria but still reported clear reading and writing difficulties in childhood and dyslexia in biological relatives. They were classified as compensated dyslexics. Infants with a parent who had indications of neurodevelopmental conditions other than dyslexia or developmental language disorder (diagnosed attention deficit disorder or an individualized curriculum in elementary school of the dyslexic parent suggesting broader cognitive deficits), or of a non-heritable cause for the reading deficit (brain trauma in childhood) were not accepted to the DyslexiaBaby study, or if they were already enrolled in the study when these issues came up, they were excluded from the present study (Fig. 1 "Parental diagnosis"). Only very severe health conditions of the children that were known to affect their nervous system and language development resulted in exclusion from the present study (e.g. severe dysmorphias, chromosomal abnormalities, Rolandic epilepsy, brain tumors; Fig. 1 "Child's later diagnosis"). Additionally, in different parts of the longitudinal study, problems related to scheduling the measurement (Fig. 1 "Scheduling") or suboptimal data quality (Fig. 1 "Poor data quality") resulted in excluding the data of certain infants from that measurement point. Very few families dropped out from the DyslexiaBaby study during the first 28 months (N = 5/210, Fig. 1 "Family withdrawn", "Contact lost").
MMR maturation was investigated in the whole sample (Table 1), including all children with usable electroencephalographic (EEG) data from at least one measurement point. The majority of these children were at familial dyslexia risk, and within the at-risk children, the majority participated in a music listening intervention at 0-6 months (with two groups: intervention 1 = int1 and intervention 2 = int2 in Table 1; for a description of the intervention, see Virtala and Partanen, 2018) while some did not (no intervention group = no-int in Table 1). In order to compare the MMRs between infants with or without familial dyslexia risk, two subsamples, a (dyslexia) risk group and a control group, were additionally taken from the whole sample (Table 1). The risk group consisted of those infants of the DyslexiaBaby dyslexia risk group 1) whose parents currently met criteria for dyslexia (N = 34/160 were excluded as their parent was classified as a compensated dyslexic, and in one case, as the parental dyslexia status could not be confirmed as compensated or uncompensated), and 2) who did not participate in the music listening intervention (as that may affect their ERPs; N = 101/160 were excluded). The DyslexiaBaby control group infants (con in Table 1) were all included in the control group of the present study. Their parents (or one, if the other parent was not available) had to report neither suspected nor diagnosed dyslexia nor other language- or learning-related disorders.
This study was conducted as part of the DyslexiaBaby longitudinal study, approved by the Ethics Committee for Gynaecology and Obstetrics, Pediatrics and Psychiatry of the Hospital District of Helsinki and Uusimaa. The study is conducted in compliance with the Declaration of Helsinki. At study enrollment in the EEG recording session at 0 months, one or both parents of the infant gave their written informed consent.

Experimental stimuli and paradigm
The experimental stimuli and paradigm have been previously described in Thiede et al. (2019) and Kailaheimo-Lönnqvist et al. (2020). ERPs were recorded to a bi-syllabic pseudo-word /tata/ (original stimulus) and its three variants, which conform to the participants' native language Finnish (Pakarinen et al., 2014; Fig. 2). The original /tata/ stimulus was spoken by a female native Finnish speaker, and uttered naturally with stress on the first syllable. Stimulus duration was 300 ms, of which ~250 ms was audible, including a natural ending. The second syllable started at ~168 ms, and the second /a/ at ~181 ms. In the three variants, the second syllable (1) frequency (fundamental frequency f0 lifted from 175 to 225 Hz, 5 semitones), (2) vowel duration (length of second syllable increased from 71 to 158 ms, total length 400 ms of which ~327 ms audible), or (3) vowel identity (second syllable replaced with naturally uttered /to/, with start time, duration, and f0 level matched to /tata/) was modified by editing the original /tata/ sound file with Adobe Audition (CS6, 5.0, Build 708) and Praat (5.4.01) software. Sound intensity level was root-mean-square normalized in the three variants in order to match their average intensity level to that of the original /tata/ stimulus. The experimental stimulus sequences additionally included a fifth stimulus type: novel sounds of human (e.g., sigh, cry, laugh) and nonhuman (e.g., telephone ring, electric drill) origin (the data will be reported elsewhere).
The stimuli were presented in a multi-feature paradigm with the original /tata/ stimulus as the repeating standard, the three variants frequency, vowel duration (henceforth: duration), and vowel identity (henceforth: vowel) as occasional deviant stimuli, and the novel stimuli presented rarely. An inter-stimulus interval of 850-950 ms, varying randomly in 10-ms steps (Fig. 2), was used. The paradigm was presented in four blocks of 472 stimuli (altogether 1340 standards, 160 of each deviant), where the stimuli were presented in an otherwise random order, but so that four standards started each block and a standard always followed a deviant or novel stimulus. At all age stages, given that the infant/child stayed calm, additional shorter paradigms were presented in the EEG recording following this paradigm (to be reported elsewhere).
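The ordering constraints of the paradigm can be illustrated with a short sketch. It is not the authors' Presentation script. The per-block counts are derived from the totals given above (1340/4 standards and 160/4 of each deviant per block); the figure of 17 novel sounds per block is an inference from the remainder (4 × 472 − 1340 − 3 × 160 = 68), not a number stated in the text, and all function names are hypothetical.

```python
import random

def make_block(rng, n_total=472, n_per_deviant=40, n_novel=17, n_lead_standards=4):
    """Build one pseudo-random block that honors the two ordering constraints."""
    # n_novel=17 is an assumption inferred from the block size, not a reported value.
    rare = ["frequency", "duration", "vowel"] * n_per_deviant + ["novel"] * n_novel
    # Glue a standard behind every rare stimulus, so that a standard
    # always follows a deviant or novel sound.
    units = [[stim, "standard"] for stim in rare]
    # The remaining standards are free to fall anywhere in the sequence.
    n_free = n_total - n_lead_standards - 2 * len(rare)
    units += [["standard"]] * n_free
    rng.shuffle(units)
    # Every block starts with four standards.
    block = ["standard"] * n_lead_standards
    for unit in units:
        block.extend(unit)
    return block

def make_isis(rng, n, low=850, high=950, step=10):
    """Draw n inter-stimulus intervals (ms) from 850, 860, ..., 950."""
    return [rng.choice(range(low, high + 1, step)) for _ in range(n)]

rng = random.Random(0)  # fixed seed only to make the illustration reproducible
blocks = [make_block(rng) for _ in range(4)]
sequence = [stim for block in blocks for stim in block]
isis = make_isis(rng, len(sequence))
```

Pairing each rare stimulus with a trailing standard before shuffling is one simple way to satisfy the "standard always follows" rule without rejection sampling; other randomization schemes would meet the same description.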

EEG recordings
All EEG recordings were conducted with a BrainProducts Quick-Amp amplifier (v. 10.08.14; software: BrainVision Recorder 1.20.0801, Brain Products GmbH, Gilching, Germany) at a sampling rate of 500 Hz, low-pass filter of 100 Hz, and high-pass filter of 0 Hz, and an online reference at the average of all electrodes. EEG was recorded with an electrode cap (ActiCap, Brain Products GmbH, Gilching, Germany) with 18 (at 0 months and 6 months) or 32 (at 28 months) electrodes placed according to the extended international 10/20 system (for details, see Fig. 3). The experimental paradigms were presented with Presentation 17.2 Software (Neurobehavioural Systems Ltd., Berkeley, CA, USA) via one (at 0 and 6 months) or two (at 28 months) Genelec speakers with a stimulus intensity of ~65 dB (sound pressure level, SPL) at the infant's/child's head. At all age stages, EEG recordings were conducted with identical equipment and protocol, except for age-specific procedures described below. The measurement, with preparations included, took approximately 1-2 hours at all age stages.
0 months. Recordings were conducted at Jorvi Hospital [...] and at the University of Jyväskylä, Finland (N = 20/190). Infants were lying in a hospital crib on their back, with the speaker placed ~40 cm from the infant's head, and their state was monitored by a trained nurse or research assistant who conducted the recording (marked with button presses on a response box, Cedrus RB844, Cedrus Corporation, California, USA, as 'active sleep', 'quiet sleep', 'awake', or 'intermediate sleep stage', based on Grigg-Damberger et al., 2007). All alertness states are included in the present data. The risk and control groups did not differ in the proportion of different alertness states: 51 vs. 64% of the risk and control groups were awake part of the time, all infants in both groups were in active sleep part of the time, and 78 vs. 67% were in quiet sleep part of the time, with no statistically significant differences between groups in Chi Square tests, p > .20 (data missing from one infant in each group). Active sleep state was the most common in 65 vs. 51% of the risk and control groups. The background noise of the room was ~40 dB (SPL) at the infant's head.
6 months. Recordings were conducted at the same recording sites (Jorvi Hospital N = 76/90, University of Jyväskylä N = 14/90), and with the same speaker placement and background noise intensity as at 0 months. Infants were awake, sitting in the caretaker's lap with the research assistant or nurse entertaining them silently by making facial expressions, showing toys, mirrors etc.
Table 1. Background information and sample sizes in the whole sample (in bold) and in the risk (RISK) and control (CON) subgroups at birth (0mo), 6 months (6mo), and 28 months (28mo): amounts of the DyslexiaBaby dyslexia risk and con(trol) participants; amounts of the DyslexiaBaby intervention (int1, int2), no-int(ervention), and con(trol) group participants; gender distributions; socio-economic status as indicated by amounts of high and low edu(cation); electroencephalogram (EEG) recording age (in days or months); and birth-related information (unit specified; with standard deviation, SD, in parentheses). Participants in the "low edu" group had no parents with higher education (tertiary education leading to an academic degree).
[Table body not recoverable from the extracted text; the only surviving row is: last Apgar score (5/10 min): 9.5 (0.7), 9.5 (0.6), 9.5 (0.8).]
Note 1. In two infants in the whole sample (N = 1 from the RISK group), Apgar score was only 6, but the infants were in good health at the 0-mo-EEG (age 13 d). In two infants in the whole sample (N = 1 from the RISK group), the Apgar score was missing, but there was no indication of health issues at time of 0-mo-EEG (7-8 d). Edu information was missing from one infant in the 0mo whole sample.
Note 2. Differences in background variables between RISK and CON were analyzed with One-way ANOVA and Chi Square tests. Age at 28 months differed statistically significantly between groups, p = .013 (~10 d). The difference was considered irrelevant (range in the whole sample 27.2-29.2 mo). The numbers of RISK/CON participants recorded at the Jyväskylä University recording site were 3/8, 2/4, and 1/6 at 0, 6, and 28 mo, respectively.
Note 3. Partly overlapping but differently analyzed EEG data of the participants of this study were reported in Thiede et al., 2019 (0mo [...]
Fig. 2. [Beginning of caption lost in extraction. Top: schematic of the stimulus sequence (/ta-ta/ standards; /ta-to/, /ta-ta:/, and frequency deviants; a novel sound), with stimulus durations shown] separately for the duration deviant, novel stimuli, and the other stimulus types, and for the inter-stimulus interval. Bottom: On the left, stimulus types, their color labels, and their probabilities (in %) in the experimental paradigm. On the right, illustration of the sound waveform of the standard /tata/ stimulus, with time (in seconds, s) on the x-axis and relative sound amplitude on the y-axis.
28 months. Recordings were carried out after a 2-hour language skill assessment, typically on a separate day, in a sound-proof, electrically shielded booth in a laboratory at the University of Helsinki (N = 131/146) and in the same laboratory at the University of Jyväskylä as at 0 and 6 months (N = 15/146). The children were awake and sitting in a chair placed 160 cm from the speakers. Either a parent or a research assistant accompanied the child in the measurement room, and the child played self-chosen tablet games (during preparations) and watched a self-chosen silenced cartoon DVD (during the recording). The child was asked not to talk or move (when necessary, also during the recording) and not to pay attention to the sounds presented. Time stamps with any talking or moving of the child during the recording were saved in the continuous EEG data and taken into account during manual rejection (see Section 2.4). The child's informed consent was ensured and possible fears regarding the EEG recording minimized by presenting an illustrated "storybook" leaflet of the recording before they entered the laboratory.

EEG data analysis
Prior to preprocessing, EEG from those stimulus blocks during which the infant/child was crying or voicing loudly most of the time, or during which the 6-month-old or 28-month-old accidentally fell asleep, was excluded from the data. Additionally, the data set of one 28-month-old was excluded due to technical issues during the recording. These exclusions are included in Fig. 1 "poor data quality".
Preprocessing was conducted with Matlab 2017a-2020a (The MathWorks, Inc., USA), with Toolboxes EEGLAB 14.0.0b and 2019_0 (Delorme and Makeig, 2004) and ERPLAB 7.0.0 (Lopez-Calderon and Luck, 2014). EEG was first filtered (0.025-40 Hz band pass) to exclude large low and high frequency artifacts and to allow for visual inspection of "bad" electrodes. Electrodes with flat or continuously noisy (large-amplitude, high-frequency activity or massive drifting) signal were marked as "bad". No more than five electrodes (28%) in the 0- and 6-month EEG and six electrodes (19%) in the 28-month EEG were marked "bad". If a stimulus block had more flat or noisy electrodes, it was excluded. Bad electrodes that were peripheral (0 and 6 months: Fp1, Fp2, F7, F8, and Oz; 28 months: additionally T7, T8, Po9, Po10, O1, O2, see Fig. 3) were omitted from analysis, while the rest of the electrodes (0 and 6 months: F3, Fz, F4, C3, Cz, C4, P3, Pz, P4; 28 months: additionally FC5, FC1, FC2, FC6, CP5, CP2, CP3, CP6) marked bad were interpolated (max 2 per infant or 3 per child) at a later stage based on the signal in the rest of the valid electrodes. In the 28-month data, parts with clear muscle-related artifacts confirmed both visually in the continuous EEG and from time-stamped recording notes were manually omitted from the data. Eye-movement and heart-beat artifacts visible in the 28-month data (at Fp1 and Fp2 close to eyes and LM and RM at mastoids, respectively) were marked in the data for a later artifact removal stage. In the 0-month and 6-month data, muscle-related, eye-movement, and heart-beat artifacts were not clearly identifiable and were therefore not searched for.
EEG was then filtered with a 0.5 Hz high-pass and a 25 Hz low-pass filter, and re-referenced to an average of four electrodes close to the mastoids (LM, RM, P7, and P8). Reference electrodes that were flat or had a signal continuously exceeding ±250 mV were considered broken; in this case, the electrode and its contralateral pair were eliminated, and an average of the remaining reference electrodes was used. If both reference electrodes on one side of the head were considered broken, data of that stimulus block were excluded. Interpolation of non-peripheral bad channels was then conducted. Eye-movement and heart-beat artifacts marked in the 28-month data were then corrected for with independent component analysis (ICA). The independent components found with the fastica algorithm (Hyvärinen, 1999) or, in case it did not converge, the runica algorithm in EEGLAB were compared to the artifact in the raw data and its expected scalp distribution to decide whether a component should be removed from the data. Components were not removed if the result of removing the component from the data was unsatisfactory based on visual inspection (e.g., the artifact was not diminished or the algorithm changed other parts of the data).
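The rule for handling broken reference electrodes can be written as a small decision function. This is a sketch rather than the authors' Matlab code. It assumes that the contralateral pairs are LM-RM and P7-P8 (the text does not name the pairs), and the handling of the case where no reference electrode remains is likewise an assumption, since the text does not describe it. The function name is hypothetical.

```python
# Assumed contralateral pairs among the four re-referencing electrodes.
PAIRS = {"LM": "RM", "RM": "LM", "P7": "P8", "P8": "P7"}
LEFT, RIGHT = {"LM", "P7"}, {"RM", "P8"}

def select_reference(broken):
    """Return the electrodes to average for re-referencing,
    or None if the stimulus block should be excluded."""
    broken = set(broken)
    # Both reference electrodes on one side broken: exclude the block.
    if LEFT <= broken or RIGHT <= broken:
        return None
    # Drop each broken electrode together with its contralateral pair.
    dropped = broken | {PAIRS[e] for e in broken}
    remaining = sorted(e for e in PAIRS if e not in dropped)
    # Assumption: if nothing remains (e.g. LM and P8 broken), exclude as well.
    return remaining or None

print(select_reference([]))            # all four electrodes are used
print(select_reference(["LM"]))        # LM and RM dropped, P7 and P8 averaged
print(select_reference(["LM", "P7"]))  # whole left side broken, block excluded
```

Dropping the contralateral partner keeps the reference symmetric across hemispheres, which matters here because hemispheric lateralization is one of the outcome measures.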
Fig. 3. [Beginning of caption lost in extraction.] One or both parent(s) of the infant/child gave permission for the University of Helsinki to use the photos without the infant's/child's name. Right: Electrode layout at 0 and 6 months (in pink) and 28 months (pink and blue). The ground (black) and ref (green, the active online reference) electrodes and the electrodes used as reference electrodes in re-referencing (turquoise, see Section 2.4) were the same at 0, 6, and 28 months. Peripheral electrodes are indicated by a dashed circle and were not included in statistical analyses due to their overall poor signal quality.
Continuous EEG was segmented to epochs starting at -100 ms and ending 840 ms after stimulus onset, with baseline correction according to the average voltage in the -100-0 ms pre-stimulus interval. Epochs were excluded based on the following criteria to omit eye-movement-related artefacts and slow drifts from the signal: if amplitude exceeded ±120 µV at Fp1 and Fp2 electrodes, and if the epoch had a drift of >100 µV or data points ±3 SD from the mean amplitude of all epochs (jointprob algorithm in EEGLAB, separately for each electrode and averaged across electrodes). The remaining epochs were separated by stimulus type (standard, duration deviant, frequency deviant, vowel deviant), for each stimulus block and electrode. Epochs for standard stimuli immediately following a deviant were not included in the standard epochs. Epochs of all stimulus blocks of the same stimulus type were then merged separately for each participant, resulting in one dataset per participant and stimulus type. Data of infants/children with fewer than 30 accepted epochs for more than one of the three deviants were excluded (0 months N = 0, 6 months N = 41, 28 months N = 7; included in Fig. 1 "poor data quality"). The final whole sample at 0, 6, and 28 months had on average 113 (range: 29-188), 51 (24-94), and 78 (23-137) accepted trials per infant for each deviant, respectively. Subtraction waveforms were calculated separately for the duration, frequency, and vowel deviants by subtracting the standard response from the deviant response and shifting the baseline correction to -100-0 ms from the onset of the deviation, i.e., 125-225 ms from the stimulus onset for the duration subtraction waveform and 80-180 ms for the frequency and vowel subtraction waveforms.
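The subtraction and baseline shift can be sketched for a single electrode as follows. This is a minimal illustration, not the authors' ERPLAB pipeline. The deviance onsets (225 ms for duration, 180 ms for frequency and vowel) follow from the baseline windows stated above; the function names and the synthetic waveforms are hypothetical.

```python
def time_to_index(t_ms, epoch_start_ms=-100, srate_hz=500):
    """Sample index of a latency (ms) in an epoch starting at -100 ms, sampled at 500 Hz.
    Latencies that fall between samples are floored."""
    return (t_ms - epoch_start_ms) * srate_hz // 1000

def subtraction_waveform(deviant, standard, deviance_onset_ms):
    """Deviant-minus-standard waveform, re-baselined to the 100 ms
    preceding the onset of the deviation."""
    diff = [d - s for d, s in zip(deviant, standard)]
    i0 = time_to_index(deviance_onset_ms - 100)
    i1 = time_to_index(deviance_onset_ms)
    baseline = sum(diff[i0:i1]) / (i1 - i0)
    return [v - baseline for v in diff]

# Synthetic averaged responses (µV), -100 to 840 ms at 2-ms steps = 471 samples.
n_samples = time_to_index(840) + 1
standard = [0.0] * n_samples
# Deviant response: constant 1.5 µV offset plus a 3 µV step from 300 ms onwards.
deviant = [1.5 + (3.0 if i >= time_to_index(300) else 0.0) for i in range(n_samples)]

# Frequency or vowel deviant: deviance onset at 180 ms, baseline 80-180 ms.
diff = subtraction_waveform(deviant, standard, deviance_onset_ms=180)
```

In this toy case the constant offset is removed by the shifted baseline, so only the post-deviance step survives in `diff`. That is the purpose of re-baselining at deviance onset: differences that arise before the stimuli diverge do not leak into the MMR estimate.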

ERP quantification and statistical analysis
Cluster-based mass permutation tests implemented in the FieldTrip toolbox (Oostenveld et al., 2011; Maris and Oostenveld, 2007) were employed to determine spatiotemporal windows of significant deviant-standard differences for each deviant type and measurement time point between deviance onset and epoch end (840 ms). First, time ranges with significant deviant-standard differences (p < .05) with the same polarity in adjacent time points at two or more neighboring channels were determined. Then, the sum of t-values for each such cluster was computed. The test statistic was defined as the maximum of these sum t-values. To determine a null distribution for the test statistic, the stimulus labels (deviant vs. standard) were randomly permuted 5000 times and the test statistic was computed for each iteration. The cluster sum t-values obtained with the true labels were deemed significant if they exceeded the top or bottom 2.5 percentile of the test statistics obtained with the permuted labels. All except the peripheral and reference electrodes were included in the analyses (see Fig. 3). Note that this approach controls the family-wise Type I error rate across the tested time points and channels.
For quantifying the mean amplitudes and peak latencies of MMN, P-MMR, and LDN responses, individual peak latencies were searched from broad time windows (Table 2) in a large region-of-interest (ROI) of 6 electrodes that seemed most appropriate based on the permutation tests (see Fig. 3: F3, Fz, F4, C3, Cz, and C4 for all responses except for LDN at 6 months, C3, Cz, C4, P3, Pz, P4) with an additional low-pass filter of 10 Hz. Mean amplitudes were calculated from the large ROIs from original-filtered data, from time windows (width ~peak latency standard deviation) centered around the individual peak latencies. Four additional electrodes FC1, FC2, FC5, and FC6 were added to the mean amplitude calculation in the 28-month data in order to improve the signal-to-noise ratio. As the added electrode locations fall between the F- and C-rows included in the large ROI (Fig. 3), they were assumed not to markedly affect the response amplitudes or latencies. In order to study the hemispheric distribution of the MMN, P-MMR, and LDN amplitudes, additional left ROI and right ROI mean amplitudes were calculated from the left and right hemisphere electrodes, respectively (all large ROI electrodes except for the midline Fz, Cz, and in 6mo LDNs, Pz). This was done separately for each deviant and measurement point, only for the responses that were statistically significantly elicited in the whole sample based on the above-described mass permutation tests.
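The logic of the cluster-based permutation test can be illustrated with a deliberately simplified sketch. It is not the FieldTrip implementation and departs from the analysis described above in several labeled ways: it handles one channel only (no spatial neighborhood criterion), it uses a fixed cluster-forming t threshold `t_crit` instead of a pointwise p < .05 criterion, it permutes labels by flipping the sign of each participant's deviant-minus-standard difference, and it evaluates clusters against the maximum absolute cluster sum rather than separate upper and lower 2.5 percentiles. All names and the synthetic data are hypothetical.

```python
import math
import random

def t_values(diffs):
    """One-sample t statistic per time point; diffs[p][t] is the
    deviant-minus-standard value of participant p at time point t."""
    n = len(diffs)
    out = []
    for column in zip(*diffs):
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out

def cluster_sums(tvals, t_crit):
    """Sum t-values over runs of adjacent supra-threshold points of the same polarity."""
    sums, current, sign = [], 0.0, 0
    for t in tvals:
        s = (t > t_crit) - (t < -t_crit)  # +1, -1, or 0
        if s != 0 and s == sign:
            current += t
        else:
            if sign != 0:
                sums.append(current)
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums

def cluster_permutation_test(diffs, t_crit=2.0, n_perm=1000, seed=0):
    """Return (cluster sum, permutation p-value) for each observed cluster."""
    rng = random.Random(seed)
    observed = cluster_sums(t_values(diffs), t_crit)
    null = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            # Swapping deviant/standard labels within a participant
            # flips the sign of that participant's difference wave.
            sign = rng.choice((-1, 1))
            flipped.append([sign * v for v in row])
        sums = cluster_sums(t_values(flipped), t_crit)
        null.append(max((abs(s) for s in sums), default=0.0))
    return [(s, sum(m >= abs(s) for m in null) / n_perm) for s in observed]

# Synthetic example: 12 participants, 20 time points, effect at points 8-12.
data_rng = random.Random(1)
diffs = [[data_rng.gauss(0, 1) + (4.0 if 8 <= t < 13 else 0.0) for t in range(20)]
         for _ in range(12)]
results = cluster_permutation_test(diffs, n_perm=500)
```

Because every permutation contributes only its single largest cluster to the null distribution, an observed cluster is compared against the most extreme outcome that chance produces anywhere in the window. This is what allows one test to cover many time points without a separate correction for multiple comparisons.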
For 0mo P-MMRs and 28mo LDNs, peak latencies were searched from a window that ended at the end of epoch. For individuals with peak latencies close to the end of epoch (closer than window width for amplitude calculation divided by two), the latest possible latency window was used for calculating the mean amplitude. For individuals without a peak in the search window (missing values), mean amplitude was calculated from a window centered at the group average peak latency. Their peak latencies were not replaced in the data but were treated as missing values. Percentages of missing peak values in the whole sample were highest for duration-MMNs (54% at 0mo and 25% at 28mo), but otherwise <10% (for 0mo-P3a's 0-1%; for 6mo-P-MMRs 0-1%; for 6mo-LDNs 0-2%; for 28mo-frequency-MMN 2%; for 28mo-P-MMRs 5-7%; 28mo-LDN's 3-8%).
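The peak search and the two fallback rules described above (a window shifted to the end of the epoch, and a window centered at the group average latency when no individual peak is found) can be sketched as follows. The function names are hypothetical, and defining a peak as a local extremum inside the search window is an assumption of this sketch, since the exact peak criterion is not stated.

```python
from statistics import mean

def find_peak_latency(wave, times, window, polarity=1):
    """Latency of the largest local extremum of the given polarity
    (+1 positive, -1 negative) inside window=(start, end); None if absent."""
    start, end = window
    best = None
    for i in range(1, len(wave) - 1):
        if not start <= times[i] <= end:
            continue
        v, left, right = (polarity * wave[j] for j in (i, i - 1, i + 1))
        if v > left and v >= right and (best is None or v > polarity * wave[best]):
            best = i
    return None if best is None else times[best]

def mean_amplitude(wave, times, centre, width, epoch_end):
    """Mean voltage in a window of `width` ms centred on `centre`; the window is
    moved to the latest possible position if it would pass the epoch end."""
    half = width / 2
    if centre + half > epoch_end:
        centre = epoch_end - half
    return mean([v for v, t in zip(wave, times) if centre - half <= t <= centre + half])

def quantify(wave, times, search_window, width, epoch_end, group_peak, polarity=1):
    """Return (peak latency or None, mean amplitude). A missing peak stays
    missing; only the amplitude window falls back to the group average latency."""
    latency = find_peak_latency(wave, times, search_window, polarity)
    centre = group_peak if latency is None else latency
    return latency, mean_amplitude(wave, times, centre, width, epoch_end)
```

Returning `None` for the latency while still returning an amplitude mirrors the handling described above, in which missing peak latencies are not replaced but mean amplitudes are available for every participant.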
Table 2 Search windows used for mean amplitude calculation of the mismatch negativity (MMN), positive mismatch response (P-MMR), and late discriminative negativity (LDN) responses at birth (0mo), 6 months (6mo), and 28 months (28mo). Search windows describe the latency windows (in ms from deviance onset) for searching the individual response peaks. Width (in ms, in brackets) indicates the width of the latency window that was centered at the individual response peaks in order to calculate the individual mean amplitudes. The epoch ends at 615 ms from deviance onset for the duration deviant and at 660 ms from deviance onset for the frequency and vowel deviants.
Maturational changes in peak latencies and mean amplitudes in the large ROI were investigated with linear mixed models (LMMs) in R using the lme4 package (Bates et al., 2007) with age as a fixed factor and subject as a random factor. This was done when, based on the mass permutation tests, a response was statistically significantly elicited by the same deviant in at least two measurement points in the whole sample. As an exception, maturation of the P-MMR was not statistically analyzed across all the three ages because (1) the change of the response amplitude seemed nonlinear and (2) it seemed particularly unclear whether the components reflected the same neural process across ages (see Results). Separate analyses were conducted for MMN, P-MMR, and LDN, but when the same response was elicited by several deviants in more than one measurement point, deviants were analyzed together, and deviant was included as a fixed factor in the model (main effect of stimulus not reported). In order to investigate hemispheric differences in mean amplitudes, the LMMs were repeated with the left and right ROIs, with hemisphere included as a fixed factor in the model. Only those main and interaction effects that include the effect of hemisphere are reported for these LMMs.
Statistical significance of the MMRs within the risk and control groups was analyzed with One Sample t tests (Bonferroni-corrected for multiple comparisons within age group). Group differences were analyzed for all responses that were statistically significantly elicited by at least one deviant and at least one of the two groups based on the One Sample t tests. When a response was statistically significantly elicited by a deviant in at least two age groups, group differences (risk vs. control) in peak latencies and mean amplitudes and their maturation were investigated with LMMs described above, with the effect of group added to the model. Only those main and interaction effects that include the effect of group are reported for these LMMs. When a response was statistically significantly elicited in only one age group, the group comparison was conducted with a (repeated-measures) analysis of variance, (RM)-ANOVA, with stimulus type as a within-subject factor (if several deviants elicited the same response), or a One-way ANOVA (if only one deviant elicited the response). In order to investigate hemispheric differences in mean amplitudes between groups, the LMMs and ANOVAs were repeated with the left and right ROIs, with hemisphere included as a fixed/within-subject factor in the model. Only those main and interaction effects that include the effect of hemisphere are reported for these LMMs.
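One plausible way to write down the maturation model described in this section is given below. The exact model formula is not reported, so the random-intercept structure, the additive deviant term, and all symbols are assumptions made for illustration.

```latex
% Assumed random-intercept model for participant i, age a, deviant d:
%   y_{iad}    : peak latency or mean amplitude in the large ROI
%   \alpha_a   : fixed effect of age (measurement point)
%   \gamma_d   : fixed effect of deviant (only when several deviants are analyzed together)
%   u_i        : random intercept of participant i
\begin{equation}
  y_{iad} = \beta_0 + \alpha_a + \gamma_d + u_i + \varepsilon_{iad},
  \qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
  \qquad \varepsilon_{iad} \sim \mathcal{N}(0, \sigma^2)
\end{equation}
```

Under this reading, the hemispheric analyses would add a fixed hemisphere term and its interactions with age to the same structure.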

MMRs and their maturation in the whole sample
At birth, duration deviants elicited an MMN followed by a broad P-MMR, and frequency and vowel deviants only elicited a broad P-MMR (deemed significant by the mass permutation tests, Fig. 4). At 6 months, all three deviants elicited a P3a, and duration and frequency deviants elicited an LDN in the centro-parietal electrodes. At 28 months, duration deviants elicited an MMN-P-MMR-LDN complex, frequency deviants elicited an MMN followed by an LDN, and vowel deviants elicited a P-MMR followed by an LDN.
Maturation of the MMRs is illustrated in Fig. 5, mean amplitudes and peak latencies are listed in Table 3, and a summary of the statistically significant effects is provided in Table 4. Complete statistics of the LMMs are provided in the Supplementary Table S1.

Hemispheric distribution of the MMRs and its maturation in the whole sample
When hemispheric lateralization (left vs. right ROI) was added to the LMMs, no significant main or interaction effects of hemisphere were found (for duration- and frequency-LDNs at 6 and 28 months, main effect of hemisphere was p = .063 with numerically larger amplitudes in the right than left ROI; for all others p > .10). Other effects were not investigated. The complete results of the LMMs including hemispheric lateralization are listed in the Supplementary Table S2.

MMRs and their maturation in the control and dyslexia risk groups
One Sample t test statistics of group-wise response significance are reported in Table 5 and MMRs in the two groups are illustrated in Fig. 6. At birth, P-MMRs were significant to all three deviants in both groups. Duration-MMN did not reach significance in either group. At 6 months, P-MMRs were significant to all three deviants in both groups, while duration- and frequency-LDNs remained significant after Bonferroni corrections only in the risk group. At 28 months, duration-P-MMR, frequency-MMN, frequency-LDN, vowel-P-MMR, and vowel-LDN were significant in both groups, while duration-LDN remained significant after Bonferroni corrections only in the control group. Duration-MMN did not reach significance in either group. Based on these results, all responses except for the duration-MMN at 0 and 28 months were included in the group comparison LMMs and ANOVAs. A summary of the statistically significant effects is provided in Table 6. Complete statistics of the LMMs are reported in Supplementary Table S3.
The LMM analysis on the maturation of duration-, frequency-, and vowel-P-MMR amplitude at ages 0 and 6 months yielded a Group × Stimulus interaction, F(2,267) = 5.948, p < .01, which results from a larger duration-P-MMR in the risk vs. control group (p < .05) and a larger vowel-P-MMR in the control vs. risk group (p < .01). The corresponding analysis for response latency revealed a significant Group × Time interaction, F(1,334) = 4.850, p < .05, indicating that the age-related reduction in P-MMR latency for duration, frequency, and vowel changes was larger in the risk group than in the control group. The LMM analysis conducted on the duration- and vowel-P-MMR amplitudes at ages 6 and 28 months revealed a Group × Stimulus interaction, F(1,102) = 8.602, p < .01, which resulted from a larger duration-P-MMR in the risk group (p < .05) and larger vowel-P-MMR in the control group (p < .05). No significant group differences were observed in the corresponding analysis for response latency.
The LMM analysis conducted on the duration- and frequency-LDN amplitudes at ages 6 and 28 months revealed a Group × Time interaction, F(1,178) = 6.042, p < .05, indicating that the increase in duration and frequency LDN amplitudes between 6 and 28 months was larger in the control than in the risk group, reaching significance only in the control group (p < .001). The corresponding analysis for latency yielded a significant Group × Time × Stimulus interaction, F(1,116) = 4.721, p < .05. The post hoc pairwise comparisons showed that in the risk group, the duration-LDN latency did not significantly increase with age, whereas the latency of the frequency-LDN did (p < .05). In contrast, both the duration- and frequency-LDN increased in latency in the control group (both p < .05) between 6 and 28 months. One-way ANOVAs of the frequency-MMN and vowel-LDN mean amplitudes and latencies at 28 months, not included in the LMMs, yielded no statistically significant group differences.

Hemispheric differences between groups
When hemispheric lateralization was added to the LMMs, no statistically significant main or interaction effects of hemisphere on MMR amplitudes or their maturation were found (in all p > .20). Other effects were not investigated. Complete results of the LMMs including hemispheric lateralization are listed in Supplementary Table S2. A RM-ANOVA of the frequency-MMN mean amplitude at 28 months yielded a statistically significant Hemisphere × Group interaction, F(1,58) = 5.212, p = .026, ηp² = .082. This resulted from a statistically significant hemispheric difference in frequency-MMN amplitude in the control group only, p = .048. Numerically, mean amplitudes were larger in the right (−3.828 µV) than left hemisphere (−3.203) in the control group, and larger in the left (−3.567) than right hemisphere (−3.160) in the risk group. A corresponding RM-ANOVA of the vowel-LDN mean amplitude at 28 months yielded no statistically significant main or interaction effects.
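For readers who wish to check the reported effect size, partial eta squared can be recovered from the F statistic and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the Hemisphere × Group interaction; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Hemisphere x Group interaction for the frequency-MMN at 28 months: F(1,58) = 5.212
print(round(partial_eta_squared(5.212, 1, 58), 3))  # 0.082
```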

Discussion
The present results demonstrate the expected emergence of a positive-negative-positive MMR to three deviants in speech sounds by 28 months of age in the longitudinal DyslexiaBaby sample. A broad positivity, here termed P-MMR, was the most prevalent response at birth and at 6 months in response to all three deviants, and it was elicited by duration and vowel deviants but not by frequency deviants also at 28 months. The P-MMR grew with age by 6 months and then decreased by 28 months, while  its latency decreased throughout the follow-up period, in line with the hypotheses. Only the duration deviant elicited an early negativity, here termed MMN, already at birth. It grew in amplitude and decreased in latency by 28 months, as hypothesized, at which age it was also elicited by frequency but not vowel deviants. The late negative response, here termed LDN, was first seen at 6 months in centro-parietal electrode locations to duration and frequency but not vowel deviants. The LDN was the most prevalent response at 28 months, when it was elicited by all three deviants. The LDN grew in amplitude, as hypothesized, but unexpectedly, it also increased in latency by 28 months.
MMRs that were statistically significant in the whole sample mostly also reached statistical significance in the control and dyslexia risk groups, with some exceptions in one (particularly 6-month LDNs in the control group) or both groups (duration-MMN at 0 and 28 months). In the dyslexia risk compared to the control group, consistently across the ages, the vowel-P-MMR was diminished, in line with the hypotheses, and the duration-P-MMR was enlarged, against the hypotheses. The P-MMR latency decrease between 0 and 6 months was larger in the dyslexia risk than the control group. The duration- and frequency-LDN amplitude increase from 6 to 28 months seen across the whole sample was larger in the control than the dyslexia risk group, and significant only in the control group. Its latency increase with age was statistically significant in the control group for both deviants, but in the dyslexia risk group, only for the frequency deviant.

MMRs and their maturation in the first years of life
We expected a three-peaked pattern in the infant MMR, consisting of an early negativity that grows with age (MMN), a positivity (P-MMR/DP/P3a) that decreases after 6 months, and a late negativity (LN/Nc/late negativity) that grows with age (longitudinal studies: Choudhury and Benasich, 2011; Fellman et al., 2004; Kushnerenko et al., 2002; cross-sectional evidence: e.g., Shafer et al., 2011; Slugocki and Trainor, 2014). The present findings are well in line with this pattern, validating previous findings with a larger sample size (N = 90-190 depending on the age group, compared to a total N = 12-56 in the three previous longitudinal studies). As response elicitation and maturation seemed to vary according to deviant type but most previous studies only report one deviant (see below), reporting MMRs to three auditory deviants increases the value and generalizability of the present findings.
Table 3 Mean amplitudes (in µV, with standard deviation, SD, in parentheses), peak latencies (in ms from deviance onset, SD in parentheses), and sample sizes (N) of each peak latency of statistically significant mismatch negativity (MMN), positive mismatch response (P-MMR), and late discriminative negativity (LDN) responses in the whole sample at birth (0mo), 6 months (6mo), and 28 months (28mo) at the large, left, and right regions-of-interest (ROIs).
MMN. The MMN was prevalent (significantly elicited by two out of three deviants) in the present dataset only at 28 months, whereas at earlier ages, it was only elicited by the duration deviant at birth (a small non-significant negativity was visible also to the frequency deviant). The absence of an MMN to the other deviants, together with the longer duration of the duration deviant (standard-deviant difference significant already at deviance onset, Fig. 4), suggests that rather than genuine deviance detection, the neonatal "duration-MMN" may in part or fully reflect a prolonged obligatory ERP to the long duration deviant stimulus (see also Thiede et al., 2019, partly same newborns: using a controlled standard stimulus abolished the neonatal duration-MMN). The absence of an MMN (e.g., Friederici et al., 2002; Shafer et al., 2011, where the obtained late negativity rather resembles the LDN), or the lack of statistical confirmation for its significance (e.g., Cheng and Lee, 2018; Kushnerenko et al., 2001) is a typical finding also in previous studies. Hence, the MMN may not be such an ontogenetically early response as previously argued (see e.g., Cheour et al., 2000). Possibly MMNs are elicited only by sufficiently large deviances in infants and young children, in line with our finding that the MMN was not elicited at 28 months by the presumably least acoustically salient vowel deviant (see below). Alternatively, the absence of an MMN in the present data may be (partly) due to the large proportion of dyslexia risk infants (see below).
If the neonatal duration-MMN still reflects some genuine MMN-like processing, the observed maturational pattern of its amplitude growing and the latency decreasing with age is in line with previous studies (Fellman et al., 2004; Slugocki and Trainor, 2014).
Table 5 Mismatch negativity (MMN), positive mismatch response (P-MMR), and late discriminative negativity (LDN) mean amplitudes (mean ampl, in µV, with standard deviation SD in parentheses), peak latencies (in ms from deviance onset, standard deviation in parentheses), and One Sample t statistics of mean amplitudes in the risk (RISK) and control (CON) subgroups at the large regions-of-interest (ROIs). Statistically significant mean amplitudes (after corrections) are in bold. Note 1. Peak latency was searched separately from each individual and therefore the data contains missing values. The column "peak latency N" lists the sample size available for each response at each age group. Note 2. Mean amplitudes are calculated from the following large ROIs: at 0mo and 6mo, F3, Fz, F4, C3, Cz, and C4, except for the 6mo-LDNs, C3, Cz, C4, P3, Pz, and P4; at 28mo, F3, Fz, F4, C3, Cz, C4, FC1, FC2, FC5, and FC6. The peak latencies are calculated from the following large ROIs: at 0mo, 6mo, and 28mo, F3, Fz, F4, C3, Cz, and C4, except for the 6mo-LDNs, C3, Cz, C4, P3, Pz, and P4.
P-MMR. The P-MMR (also called P3a/MMR/DP; e.g., Trainor et al., 2001; Kushnerenko et al., 2002; Slugocki and Trainor, 2014) was the most prevalent response across ages in the present data (although, for some reason, not elicited by the frequency deviant at 28 months). It has been the most often reported MMR also in previous infant literature (e.g., Choudhury and Benasich, 2011; Slugocki and Trainor, 2014; Cheng et al., 2015; Háden et al., 2015) and even in older children (6.5 years: Maurer et al., 2003). In line with earlier reports, it decreased in amplitude after 6 months (Choudhury and Benasich, 2011; see also Slugocki and Trainor, 2014; Shafer et al., 2011). The P-MMR grew from 0 to 6 months and was so predominant at 6 months that the MMN-like negativities visible at birth and 28 months to duration and (not significantly to) frequency deviants were not visible at all in the 6-month grand averaged data (Figs. 4, 5). As the MMN and P-MMR did not reliably co-exist in the infant MMR (both statistically significantly elicited only by the duration deviant at birth), it remains rather unclear whether the broad infant-P-MMR seen in the present study represents an "immature P3a" (as suggested by Kushnerenko et al., 2013) or an "immature MMN" that would shift polarity with maturation (e.g., He et al., 2007, 2009; Trainor et al., 2003; Trainor, 2012). Nevertheless, it seems to be the most robust index of auditory change detection in infancy but not necessarily at 28 months (see below). However, the neonatal P-MMR peak was flat and broad in the present data, even resembling a low-frequency drift (despite drift rejections, see 2.4). In fact, a stricter high-pass filter of 1 Hz seemed to abolish the P-MMR to some deviants (see Supplementary Figure S1; He et al., 2007). As the focus of the present study was on MMR maturation, and the a priori decided less strict 0.5 Hz high-pass filter that was used also in previous longitudinal studies (e.g. Choudhury and Benasich, 2011; see however Cheng and Lee, 2018; Háden et al., 2015; Fellman et al., 2004) seemed to work well overall, we chose not to re-analyze the data with a stricter filter.
Colorful bars illustrate the time windows used for searching the individual peak latencies, and arrows mark the mean peak latencies in each group (red = RISK, black = CON). Response labels in parentheses mark the responses that did not remain statistically significant after corrections but still were p < .05 in uncorrected tests (see Table 4).

Table 6
Summary of the statistically significant (p < .05) main results of the linear mixed model (LMM) analyses for mean amplitudes (AMPL) and peak latencies (LAT) of the mismatch negativities (MMNs), positive mismatch responses (P-MMRs), and late discriminative negativities (LDNs) in the risk vs. control subgroups (effect: group) with effects of time (age in months, mo) and stimulus (deviant type duration, vowel, and/or frequency or all three) as fixed factors. Only the effects including group are reported.
Virtala, V. Putkinen, L. Kailaheimo-Lönnqvist et al. Clinical Neurophysiology 137 (2022) 159-176
LDN. The LDN (also called LN/Nc/late MMN, e.g., Kushnerenko et al., 2002; Fellman et al., 2004; Shafer et al., 2011; Korpilahti et al., 1995) was elicited by duration and frequency deviants at 6 months and robustly by all three deviants at 28 months. Previous infant (e.g. Kushnerenko et al., 2002; Fellman et al., 2004; Martynova et al., 2003) and early childhood (e.g., Putkinen et al., 2012) studies have reported similar late negativities.
The growth in its amplitude from 6 to 28 months is in line with previous findings that the LDN emerges during the first two years of life (e.g. Kushnerenko et al., 2002; Shafer et al., 2010), after which it should start to diminish with age (e.g., Cheour et al., 2001; Linnavalli et al., 2018). The increasing latency of the LDN with age in the present study is in contrast with at least one previous study (Courchesne, 1990), but could be attributable to the emergence of the MMN at an earlier latency, "pushing" the LDN peak further (for a similar suggestion regarding the obligatory N2, see Choudhury and Benasich, 2011). The rather incidental finding of the 6-month LDN being statistically significant at posterior (centro-parietal rather than fronto-central, as suggested by previous work, see Cheour et al., 2001) electrode sites is compromised by the poor data quality of the 6-month EEG (see below). As the scalp distribution outside of the hemispheric lateralization was not in focus in the present study, future studies should investigate this question further.
Deviant type. Although differences between the three deviants were not of main interest in the present study, it is notable that the response complex elicited by them differed particularly at 28 months (see also Supplementary Tables S1-S3). Also, e.g., Putkinen et al. (2012) reported 1-3 different MMR response peaks in 2-3-year-olds in a multi-feature paradigm depending on the deviance type. It thus seems that in small children and maybe for several years in childhood (e.g. at 5-6 years, Linnavalli et al., 2018), the MMR is still developing. In the present study, the duration deviant seemed to elicit the most mature (MMN-P-MMR-LDN) response complex at 28 months (in line with Putkinen et al., 2012), while for the frequency deviant, the P-MMR and for the vowel deviant, the MMN were still absent. Particularly Cheng et al. (2013; Cheng and Lee, 2018) have reported that the elicitation of negative vs. positive MMRs during the first years of life may depend on the salience of the deviance in speech stimuli, so that acoustically large changes elicit an MMN from early on, while acoustically small changes only elicit P-MMRs (see also Kushnerenko et al., 2007; Peter et al., 2016). Following this line of thought, the duration and frequency deviants, requiring detection of one basic sound feature only, are acoustically more salient changes than the vowel deviant (a rather subtle change from /a/ to /o/), possibly explaining the absence of the vowel-MMN in the present data (for similar results in control group newborns see Thiede et al., 2019, partly overlapping data). Furthermore, as the longest stimulus causing the largest increase in sound energy, the duration deviant may be the most salient of the three, providing an explanation for its most mature response pattern at 28 months (however, based on visual inspection of Fig. 5, also its later latency of deviance onset may have affected its obligatory response and therefore the MMR morphology).
The vowel deviant may also be particularly challenging for the children in the present dataset due to the high proportion of dyslexia risk children (see below). Thus, the absence of a vowel-MMN at 28 months most likely reflects (1) the subtlety (low salience) of the change from /a/ to /o/, causing slower MMR maturation, and (2) the specific difficulties that dyslexia risk children may have with phoneme discrimination. Based on visual inspection of the control group frequency-MMRs at 28 months (Fig. 6), a small positive response can be seen between the MMN and LDN; therefore, also the absence of a frequency-P-MMR at 28 months may be partly attributable to the high proportion of dyslexia risk infants (see below). Future studies should acknowledge that the pace of maturation of the three MMR peaks in the early years may thus vary markedly according to both stimulus- and participant-related factors.

MMRs and familial dyslexia risk
The diminished vowel-P-MMR obtained across the three ages in the dyslexia risk group was an expected result based on previous phoneme-MMN findings in infants (van Leeuwen et al., 2006, 2008) and adults (e.g. Schulte-Körne et al., 2001; see, however, Thiede et al., 2020) and the phonological deficit theory (Peterson and Pennington, 2015; Vellutino et al., 2004). For example, P-MMRs to consonant changes were diminished or absent in dyslexia-risk 2-month-olds (Dutch Dyslexia Programme, van Leeuwen et al., 2006; van Leeuwen et al., 2008). The obtained enlarged duration-P-MMR in the dyslexia risk group was, however, in contrast with previous findings in infants (e.g., diminished/absent late negative MMRs, Leppänen et al., 2002) and children (school-aged: Corbera et al., 2006). Even so, somewhat similar findings have been reported from the Jyväskylä Longitudinal Study of Dyslexia. A positive MMR in dyslexia risk 6-month-old infants but not controls was found by Leppänen et al. (2002) to consonant duration changes. Pihko et al. (1999) reported a more positive deviant than standard ERP (suggesting a P-MMR) only in the dyslexia risk and not in the control group newborns in response to a vowel duration deviant (see also Leppänen et al., 1999).
In sum, previous results regarding the P-MMRs elicited by speech sound duration changes in dyslexia risk infants are, in fact, rather inconclusive, although basic auditory processing deficits in duration discrimination are expected in dyslexia (Hämäläinen et al., 2013). Should the enlarged P-MMRs in the dyslexia risk group be interpreted as enhanced processing of duration changes? Based on previous work (e.g. Choudhury and Benasich, 2011) and the present study, the P-MMR diminishes with age after 6 months, and thus a large response could be interpreted to indicate immature processing in the dyslexia risk group. Indeed, in older children (6-year-olds), an enlarged P-MMR to consonant changes was associated with dyslexia risk (Maurer et al., 2003). However, this interpretation seems to contrast with the diminished vowel-P-MMR in the dyslexia risk group in the present study. As discussed above, the least acoustically salient vowel deviant may be processed in an immature manner still at 28 months. Following this line of thought, in the control group infants, a large vowel-P-MMR could be a sign of accurate detection of a subtle change, while a small duration-P-MMR could be interpreted as mature processing of a salient change. Based on visual inspection and numerical amplitude values (but not compared between groups due to lack of statistical significance), the enlarged duration-P-MMR in the dyslexia risk group is also associated with a diminished MMN (Fig. 6). The only MMN that was statistically significant and therefore compared between groups, the frequency-MMN at 28 months, did not show group differences in amplitude. Still, several previous studies have reported diminished or absent MMNs in dyslexia risk infants and small children (Leppänen et al., 2002; Thiede et al., 2019 with partly overlapping newborn data to the present study; van Zuijen et al., 2012; Plakas et al., 2013). As the MMN emerges and P-MMR diminishes with increasing age in infancy (e.g. Slugocki and Trainor, 2014), an enlarged duration-P-MMR coupled with a diminished/absent MMN could be interpreted as immature processing in the dyslexia risk group.
The present findings demonstrated a statistically significant LDN at 6 months only in the dyslexia risk group. The result is in contrast with previous studies showing an LDN-like late negativity to a consonant change only in the control group at 5 months (Schaadt et al., 2015) and diminished LDNs in dyslexic school-aged children (Halliday et al., 2014; Neuhoff et al., 2012). However, LDNs were enlarged in dyslexia-risk kindergarteners in one previous study (Hämäläinen et al., 2015). The LDN emerges during the first two years of life (e.g. Kushnerenko et al., 2002; Shafer et al., 2010), and then diminishes with age during childhood (e.g., Cheour et al., 2001; Linnavalli et al., 2018). The LDN amplitude increased more in the control than dyslexia risk group in the present data, which seemed to arise from (1) the absence of a statistically significant LDN in the control group at 6 months, i.e., LDN emerging in the control group by 28 months, and (2) numerically larger/more prevalent LDNs in the control than dyslexia risk group at 28 months (Fig. 6, Table 4). While the result suggests faster MMR maturation in the control group, the 6-month LDN in the dyslexia risk group only seems contradictory, as it could be interpreted as faster MMR maturation in the risk group. However, the results regarding the subgroups at 6 months should be treated with caution due to their smallest sample sizes and poorest data quality (see below).
An LDN in the dyslexia risk group only at 6 months and its steeper amplitude increase in the control than dyslexia risk group by 28 months were the only results related to dyslexia risk and frequency discrimination in the present study (except for a laterality effect in the control group only, see below). The existing literature in at-risk infants (diminished positive responses: Leppänen et al., 2010) and children (diminished MMNs: Maurer et al., 2003; Plakas et al., 2013; however see Hämäläinen et al., 2015) still suggests that along with duration discrimination, frequency discrimination consistently shows auditory deficits in dyslexia, although not in all individuals (review: Hämäläinen et al., 2013).
The maturation of MMR peak latencies demonstrated some group differences, too. Decrease of the P-MMR latency with age from birth to 6 months was larger in the dyslexia risk than control group, but the P-MMR peaked very late in the dyslexia risk group at birth. The LDN latency increase with age was statistically significant in the control group for both deviants, but in the risk group, only for the frequency deviant. However, the duration-LDN at 28 months did not reach statistical significance in the risk group and seemed more affected by dyslexia risk than frequency-LDN by visual inspection (Fig. 6).
No statistically significant differences between the left and right hemispheres in MMR amplitudes or their maturation were obtained across the whole sample in the present study. Only the frequency-MMN at 28 months demonstrated a group by hemisphere interaction effect, being right-lateralized in the control group only. The present dataset thus provides very little support for the left-hemispheric lateralization of speech-elicited MMRs already in early childhood (e.g. Kuuluvainen et al., 2016), or for the previous findings of atypical MMR lateralization in infants and children at dyslexia risk (Benasich et al., 2006; Choudhury and Benasich, 2011; Leppänen et al., 2002; van Leeuwen et al., 2008).
To conclude, the present results showed both diminished (vowel-P-MMR) and enlarged (duration-P-MMR) MMRs in infants at familial dyslexia risk consistently across ages 0, 6, and 28 months. Therefore, our results do not support a previous notion that auditory processing deficits in dyslexia risk would be most evident at birth and then attenuate during early development (see Galaburda et al., 2006;Hämäläinen et al., 2013). While group differences were also seen in MMR amplitude and latency maturation, the same responses often were statistically significantly obtained in one but not the other group at a certain age, complicating the interpretation of these results. On the other hand, as MMR elicitation within the groups was analyzed with rather conservative corrections for multiple comparisons, differences in statistically significant MMR elicitation may also or rather reflect differences in sample size, data quality, or robustness of the response (how consistently it was elicited at individual level). For example, the duration-MMNs at 0 and 28 months remained nonsignificant in both groups and were therefore not compared between groups, although both groups seemed to demonstrate an MMN that was numerically larger in the control than dyslexia risk groups (Fig. 6, Table 4).

Limitations and considerations for future studies
When interpreting the present results on MMR elicitation and maturation, as mentioned above, it is important to note that the large proportion (approximately ¾) of dyslexia risk infants in the sample may have affected the findings. We nevertheless chose to first analyze the whole sample, in order to maintain a large sample size often missing from longitudinal MMR studies. Based on the group difference results, dyslexia risk may have particularly enlarged the duration-P-MMR and diminished the vowel-P-MMR responses in the present data. It is also possible that the 6-month LDN was visible in the sample mainly due to the dyslexia risk infants (however, see below for a discussion of data quality at 6 months). As MMN may be diminished or absent in dyslexia risk infants based on previous studies and visual inspection of the present data (see above), it is possible that in a larger sample of only control infants, MMN would have been more pronounced and statistically significant to both duration and frequency changes already at birth.
Regarding the results obtained on the effects of familial dyslexia risk on speech-elicited MMRs, it is noteworthy that absent, present, diminished, and enlarged MMRs have all been associated with dyslexia risk in the present as well as previous studies (see above). All of these results are typically interpreted as "worse" (less accurate or immature) processing as a result of the familial dyslexia risk. The ongoing follow-up of these children will help in interpreting the early group differences obtained here as positive or negative predictors of future development and therefore hopefully guide future research.
Special age-related issues should be taken into account when interpreting the present results and planning future studies. First, MMRs of mostly asleep newborns are here (according to standard practices of the research field) compared to the MMRs obtained from awake older infants and children. These differences in alertness can contribute to the differences obtained between age groups in MMR morphology, amplitudes, and latencies. Furthermore, alertness state may affect neonatal MMRs (e.g., Friederici et al., 2002), but in the present study (and in several recent infant MMR studies, e.g., Háden et al., 2015), data from all alertness states were combined in order to ensure a large dataset. Importantly, dyslexia risk and control groups did not differ in the proportions of different alertness states during the recordings (see Section 2.3).
Second, the 6-month-olds demonstrated the poorest data quality, both by visual inspection and based on the high numbers of excluded infants and rejected epochs (see Section 2.4). This should be kept in mind particularly when interpreting the subgroup (dyslexia risk vs. control) results in the 6-month MMRs. The LMM method adopted in the present study allowed for longitudinal investigations across all three age groups despite the high amount of missing data at 6 months. Still, a large sample size is particularly important in the ERP recordings of infants around 0.5-1.5 years, who are already mobile but very limited in their capabilities to follow verbal instructions, maintain attention, or simply stay silent or stay put. Third, 2-3-year-old children may refuse participation due to fear or discomfort related to the EEG equipment and laboratory space. This was avoided very successfully in the present study with a carefully designed protocol including, e.g., a visualized "storybook" to familiarize the child with the method (see Section 2.3).

Conclusions
The present study, with its large longitudinal sample, validates previous findings on the maturation of the auditory change-elicited ERPs as follows: a broad positivity (MMR, P-MMR, DP, or P3a) is the most prevalent MMR in infancy, whereas during the first years of life, it starts to diminish in both amplitude and latency. An early negativity, MMN, emerges preceding it, growing in amplitude and decreasing in latency with maturation, but it may not be such an ontogenetically early or developmentally stable response as previously suggested. A late negativity (LN, Nc, late MMN, or LDN) emerges following the positivity and is a prominent MMR in early childhood. It may grow in both amplitude and latency during its early maturation. Future studies should acknowledge the notable effects that participant- and stimulus-related factors may have on MMR elicitation in the early years, when its three peaks are still maturing. The present results also demonstrate the multi-faceted effects that familial dyslexia risk can have on neural auditory (speech) discrimination in the first years of life, providing a starting point for future longitudinal investigations of these children and their developmental outcomes.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.