Neural processing of changes in phonetic and emotional speech sounds and tones in preterm infants at term age

Objective: Auditory change-detection responses provide information on sound discrimination and memory skills in infants. We examined both the automatic change-detection process and the processing of emotional information content in speech in preterm infants in comparison to full-term infants at term age. Methods: Preterm (n = 21) and full-term infants' (n = 20) event-related potentials (ERP) were recorded at term age. A challenging multi-feature mismatch negativity (MMN) paradigm with phonetic deviants and rare emotional speech sounds (happy, sad, angry), and a simple one-deviant oddball paradigm with pure tones were used. Results: Positive mismatch responses (MMR) were found to the emotional sounds and some of the phonetic deviants in preterm and full-term infants in the multi-feature MMN paradigm. Additionally, late positive MMRs to the phonetic deviants were elicited in the preterm group. However, no group differences to speech-sound changes were discovered. In the oddball paradigm, preterm infants had positive MMRs to the deviant change in all latency windows. Responses to non-speech sounds were larger in preterm infants in the second latency window, as well as in the first latency window at the left hemisphere electrodes (F3, C3). Conclusions: No significant group-level differences were discovered in the neural processing of speech sounds between preterm and full-term infants at term age. Change-detection of non-speech sounds, however, may be enhanced in preterm infants at term age. Significance: Auditory processing of speech sounds in healthy preterm infants showed similarities to full-term infants at term age. Large individual variations within the groups may reflect some underlying differences that call for further studies.


Introduction
Preterm birth increases the risk of abnormal neurodevelopment, especially in preterm infants born at low gestational weeks or born with intrauterine growth restriction. In addition to major neurological deficits, adverse minor cognitive dysfunction and learning difficulties may exist at later ages (Mikkola et al., 2005, 2007; Jarjour, 2015). The risk for cognitive dysfunction in preterm infants can manifest as divergences in auditory discrimination skills during the first year of life (Fellman et al., 2004). Accurate discrimination of sounds and the ability to process various changes in speech are essential for normal language development. This auditory change-detection processing can be studied in the frontal and central areas of the cortex (Näätänen et al., 1978; Näätänen, 1990; Näätänen and Alho, 1995, 1997).
Even though both negative and positive MMRs can occur concurrently during infancy (Morr et al., 2002; He et al., 2007), they reflect different neural processes. Several factors can influence the polarity of the responses, such as gestational age (Leppänen et al., 2004) and sleep stage (Cheour et al., 2002a). Furthermore, stimulus characteristics may have an impact on the polarity of the MMRs (Cheour et al., 2002b; He et al., 2007; Háden et al., 2009; Cheng et al., 2015). Auditory ERPs change during the maturation of speech discrimination in childhood. The positive MMRs decline in amplitude and gradually change from surface-positivity to surface-negativity. For example, in the study by Cheng et al. (2015), responses to vowel change switched from positive MMR in newborns to negatively displaced MMN at six months of age. Positive MMRs have been demonstrated to shift into an adult-like MMN typically by the age of 7 (Shafer et al., 2000, 2010).
Several studies have examined whether preterm infants' increased risk for language problems could be seen at an early stage of development. Fellman et al. (2004) examined the association between early auditory ERPs and cognitive development in preterm infants and found that responses to changes in tone frequency (standard 500 Hz, deviant 750 Hz) in preterm infants at term age were similar to those of full-term infants. However, three months later, the preterm infants had significantly lower responses than those of term-born infants, which also correlated with the Bayley developmental index at two years of age. Jansson-Verkasalo et al. (2010) studied speech discrimination abilities during the first year of life in preterm infants and found atypical perceptual narrowing for non-native phonemes associated with later language problems. Furthermore, Hövel et al. (2014) measured auditory ERPs in preschool children who had been born very preterm and found decreased P1 responses similar to those of children with autism spectrum disorders (Jansson-Verkasalo et al., 2003), ADHD (Kemner et al., 1996) and children with an increased risk for dyslexia (Lovio et al., 2010). Finally, Mikkola et al. (2007) examined auditory ERPs of preterm infants at five years of age and similarly found small P1 responses to frequency and duration changes that they suggested were a sign of diverse auditory processing. Taken together, these studies indicate that preterm birth affects the development of the auditory system.
In the present study, our first aim was to further investigate the effect of preterm birth on auditory processing by studying the neural processing of speech-sound changes in preterm infants in comparison to their full-term peers at term age. Furthermore, studies have shown that infants can extract prosodic information from speech already shortly after birth (Sambeth et al., 2008), and the prosodic features in speech, such as intonation, stress, tone, and rhythm, enhance their language acquisition (Werker et al., 2007; Thiessen et al., 2010; Adriaans and Swingley, 2017). Accordingly, our second aim was to compare the processing of emotional information in speech sounds between these two infant groups. To study these questions, we utilized a simple one-deviant oddball paradigm with pure tones and a more challenging multi-feature MMN paradigm with phonetically and emotionally relevant speech-sound changes. With these two different paradigms, we were able to extract information about change-detection processing of both speech sounds and non-speech sounds.
Considering the previous findings by Fellman et al. (2004), we did not expect to find significant group differences in the neural processing of non-speech sounds in the oddball paradigm at term age between these two infant groups. As there have been no previous studies examining discrimination skills of speech-sound changes in preterm infants at term age, we wanted to determine if the neural processing of changes in speech sounds in the multi-feature MMN paradigm would differ between preterm and full-term infants at term age.

Participants
Preterm infants (n = 21, 10 male) and full-term infants (n = 20, 11 male) participated in this study after written informed consent from their parents. The infants were born to Finnish-speaking families, their physical condition was stable, and they had no major neuropathological findings (Table 1 provides the birth characteristics of the infants). The preterm infants received standard care on the neonatal ward, including daily skin-to-skin care conducted by either parent. The infants' hearing was verified on the neonatal ward with an otoacoustic emission screening (MADSEN AccuScreen, Budapest, Hungary) before discharge. The data of one preterm infant were omitted from further analysis due to incomplete data files. The study was approved by the Ethics Committee of the Hospital District of Helsinki and Uusimaa (Ethics Committee for gynecology and obstetrics, pediatrics, and psychiatry, 65/13/03/03/2012), and by Helsinki University Central Hospital.

Multi-feature MMN paradigm
The multi-feature MMN paradigm was originally developed by Näätänen et al. (2004, Optimum-1). The multi-feature MMN paradigm used in this study was partially the same as in Pakarinen et al. (2014) and consisted of a 336 ms standard stimulus, a Finnish naturally uttered bi-syllabic pseudo-word /ta-ta/ (46% probability, 700 in a stimulus block), and six phonetic deviants. The deviants were: change in vowel duration (/ta-ta:/, 11% probability, 175 in a stimulus block); vowel change (/ta-to/, 11% probability, 175 in a stimulus block); intensity changes (±6 dB, 5% probability each, 77 each in a stimulus block); and frequency changes (±25.5 Hz, 6% probability each, 98 each in a stimulus block). In addition, the paradigm included three emotionally uttered /ta-ta/ stimuli (happy, sad, angry) rarely appearing in the recording (3% probability each, 42 each in a stimulus block). The sounds were presented with a 650 ms stimulus onset asynchrony (SOA; onset to onset), in two stimulation blocks. In the blocks, every other sound was a standard, and every other sound was either a deviant or one of the three emotionally uttered sounds.
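As a quick consistency check, the per-block stimulus counts listed above can be converted into presentation probabilities and a block duration. The following Python sketch is illustrative only and is not part of the original study: the dictionary keys and variable names are assumptions introduced here, and the duration estimate assumes that one block contains exactly the listed stimuli presented at the 650 ms SOA.

```python
# Per-block stimulus counts as listed in the paradigm description.
# The labels are hypothetical names chosen for this sketch.
COUNTS = {
    "standard": 700,
    "vowel_duration": 175,
    "vowel_change": 175,
    "intensity_plus_6dB": 77,
    "intensity_minus_6dB": 77,
    "frequency_plus_25.5Hz": 98,
    "frequency_minus_25.5Hz": 98,
    "happy": 42,
    "sad": 42,
    "angry": 42,
}
SOA_S = 0.650  # stimulus onset asynchrony in seconds

total = sum(COUNTS.values())
# Presentation probability of each stimulus type, rounded to whole percent.
percent = {name: round(100 * n / total) for name, n in COUNTS.items()}
# Assumed block duration: every stimulus occupies one SOA.
block_duration_min = total * SOA_S / 60

for name, pct in percent.items():
    print(f"{name:24s}{COUNTS[name]:5d}  {pct:3d}%")
print(f"total stimuli: {total}, block duration: {block_duration_min:.1f} min")
```

With these counts, the rounded percentages reproduce the probabilities given in the text (46%, 11%, 5%, 6%, and 3%).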
As typical to the Finnish language, the stress in the standard stimulus was on the first syllable, which was followed by a slightly falling intonation in the second syllable. The vowel duration (/ta-ta:/) and vowel change (/ta-to/) deviants were natural utterances and thus physically differed from the standard in both syllables. The intensity changes (±6 dB) and frequency changes (±25.5 Hz), in turn, were digitally modified from the standard stimulus and differed in the second syllable only. The emotional speech sounds were prosodically exaggerated natural utterances, including clear changes in natural prosodic features, such as timbre, tone, intonation, and rhythm. They differed from the deviants and the standard stimulus in both syllables (Pakarinen et al., 2014). (See Fig. 1 and Table 2 for detailed information, and supplement for an audio clip of the paradigm. Spectrograms of the standard, deviant and emotional sound stimuli can be found in the Supplementary Fig. 1.) The emotional sounds were rated by adult listeners (n = 5), using a chart with five basic emotions (happiness, anger, fear, sadness, shame) to confirm that the emotional sounds were perceived as intended (see Supplementary Table 1).
Table 1 footnotes: (a) 4 of the preterm infants were born small for gestational age (SGA), with a birth weight of less than −2 standard deviations for the age. (b) Newborn health assessment on a scale of 1-10 (heart rate, respiration, muscle tone, reflex, and color, assessed 10 min postnatally).
K. Kostilainen, et al. International Journal of Psychophysiology 148 (2020) 111-118

Oddball paradigm
The traditional one-deviant oddball paradigm consisted of a standard tone of 1000 Hz, 100 ms in duration (80% probability, 800 per stimulus block), and a deviant tone of 1100 Hz (20% probability, 200 per stimulus block). The sounds were pure tones, presented with an 800 ms SOA, in one stimulation block. In the sequence, the sounds were presented pseudo-randomly so that no deviant sounds appeared consecutively and at least one standard sound was always presented between two deviant sounds. (See Fig. 1 and Table 2 for detailed information, and supplement for an audio clip of the paradigm.)
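The pseudo-random constraint described above (no two deviants in a row) can be sketched in Python as follows. This is one possible way to build such a sequence, not the procedure used by the authors: the function name, the labels, and the gap-selection approach are assumptions made for illustration. Each deviant is placed into a distinct gap before, between, or after the standards, so two deviants can never be adjacent.

```python
import random

def make_oddball_sequence(n_standard=800, n_deviant=200, seed=None):
    """Return a list of 'standard'/'deviant' labels with no adjacent deviants.

    Hypothetical helper: deviants are assigned to distinct gaps around the
    standards, which guarantees at least one standard between two deviants.
    """
    if n_deviant > n_standard + 1:
        raise ValueError("too many deviants to keep them non-adjacent")
    rng = random.Random(seed)
    # There are n_standard + 1 gaps: before each standard and after the last.
    chosen_gaps = set(rng.sample(range(n_standard + 1), n_deviant))
    sequence = []
    for gap in range(n_standard):
        if gap in chosen_gaps:
            sequence.append("deviant")
        sequence.append("standard")
    if n_standard in chosen_gaps:
        sequence.append("deviant")
    return sequence

sequence = make_oddball_sequence(seed=1)
# Assumed block duration: every tone occupies one 800 ms SOA.
duration_min = len(sequence) * 0.800 / 60
print(len(sequence), sequence.count("deviant"), f"{duration_min:.1f} min")
```

Under these assumptions, one block of 1000 tones at an 800 ms SOA lasts a little over 13 minutes.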

EEG recording and data analysis
The auditory ERPs were recorded on average at the age of 40 weeks of gestation (range 38-42) by a registered research nurse. The infants were mainly in active or quiet sleep during the measurement. Stimuli were presented via one loudspeaker placed 50 cm behind the head of the infant. The EEG was recorded from the electrodes F3, F4 (frontal) and C3, C4 (central), using the International 10-20 System electrode locations (low cutoff DC, high cutoff 100 Hz, sampling rate 250 Hz). Data were high-pass filtered at 1 Hz and low-pass filtered at 20 Hz. The EEG was referenced online to the left mastoid electrode and re-referenced offline to the mean value of the left and right mastoids. The data were cut into epochs starting 100 ms before the stimulus onset and ending 650 ms after stimulus onset. The epochs were baseline corrected to the mean value of the signal for the period of 100-0 ms before the stimulus onset. Epochs with signal values at any channel larger than ±150 μV were rejected from further analysis. The accepted epochs of each stimulus type (Table 3) for each electrode were averaged together for each participant and then averaged together over all participants to form the grand averages. To obtain the latencies of interest, we first conducted point-by-point t-tests to compare responses to the standard and each of the deviant and emotional sounds. For this analysis, we used the average of four electrodes (F3, F4, C3, and C4; see Supplementary Figs. 2 and 3). Based on both these data and previous EEG studies on preterms (e.g., Fellman et al., 2004), the latencies of interest were set as follows: in the multi-feature paradigm, the intervals 200-300 ms, 400-500 ms, and 550-650 ms were chosen for the emotional sounds, and 400-500 ms and 550-650 ms for the deviants. In the oddball paradigm, the same intervals of 200-300 ms, 400-500 ms, and 550-650 ms were chosen for both groups. After this, the mean MMR amplitudes in these latency windows from the four electrodes were collected for further analysis.
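The epoching steps described above (cutting epochs from -100 to 650 ms, baseline correction, rejection at ±150 μV, averaging, and taking the mean amplitude in a latency window) can be sketched for a single channel as follows. This is a minimal illustration under stated assumptions, not the authors' analysis pipeline: all function names are hypothetical, the signal is synthetic, rejection is applied after baseline correction, and millisecond-to-sample conversion rounds down at the reported 250 Hz sampling rate.

```python
from statistics import fmean

FS_HZ = 250        # sampling rate reported in the text
PRE_MS = 100       # baseline period before stimulus onset
POST_MS = 650      # epoch length after stimulus onset
REJECT_UV = 150.0  # rejection threshold in microvolts

def ms_to_samples(ms, fs_hz=FS_HZ):
    """Convert milliseconds to whole samples (rounding down)."""
    return ms * fs_hz // 1000

def extract_epochs(signal_uv, onsets):
    """Cut, baseline-correct, and threshold-reject epochs of one channel."""
    pre, post = ms_to_samples(PRE_MS), ms_to_samples(POST_MS)
    epochs = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal_uv):
            continue  # epoch would run past the recording
        segment = signal_uv[start:stop]
        baseline = fmean(segment[:pre])
        corrected = [v - baseline for v in segment]
        if max(abs(v) for v in corrected) > REJECT_UV:
            continue  # artifact: reject the whole epoch
        epochs.append(corrected)
    return epochs

def average_epochs(epochs):
    """Average accepted epochs sample by sample."""
    return [fmean(samples) for samples in zip(*epochs)]

def mismatch_response(deviant_erp, standard_erp):
    """Standard-subtracted waveform (deviant minus standard)."""
    return [d - s for d, s in zip(deviant_erp, standard_erp)]

def window_mean(erp, start_ms, stop_ms):
    """Mean amplitude in a post-stimulus latency window."""
    pre = ms_to_samples(PRE_MS)
    return fmean(erp[pre + ms_to_samples(start_ms):pre + ms_to_samples(stop_ms)])

# Synthetic example: a flat 5 uV signal with one 500 uV artifact.
signal = [5.0] * 1000
signal[420] = 500.0
epochs = extract_epochs(signal, onsets=[100, 400, 700])
erp = average_epochs(epochs)
print(len(epochs), len(erp), window_mean(erp, 200, 300))
```

In the synthetic example, the epoch around sample 400 contains the artifact and is rejected, so two of the three epochs remain.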
The data were analyzed using SPSS 25 (IBM Corporation, NY, USA). We used the two-tailed t-test when testing which specific standard-subtracted mean amplitudes from the electrodes F3, F4, C3, and C4 differed significantly from 0 μV in the above-mentioned time windows. To evaluate the between-condition comparisons, repeated-measures ANOVA was used with between-group factors of Group (preterm, full-term) and Gender (female, male), as gender might affect the brain responses (e.g., Kostilainen et al., 2018; Shafer et al., 2011). Variant (9; 6 phonetic deviants and 3 emotional stimuli) and Electrode (F3, F4, C3, C4) were used as within-group factors. We used Greenhouse-Geisser correction when applicable, and Bonferroni correction in post hoc tests; only corrected values are reported (uncorrected degrees of freedom are reported). Effect sizes for all ANOVAs are reported using partial eta squared (η²). The main effects of the Electrode in the ANOVAs were omitted from the results section. The relationships between the MMR amplitudes and the gestational age and weight at birth in preterm infants were examined with the Pearson correlation test.
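The quantities named above can be illustrated with a short Python sketch. The analyses in the study were run in SPSS; the helper functions below are hypothetical and only show the standard textbook definitions: the one-sample t statistic against 0 μV, a Bonferroni adjustment that multiplies each p-value by the number of comparisons, and partial eta squared recovered from an F value and its degrees of freedom. Obtaining a p-value for the t statistic would additionally require a t-distribution, which is left out here.

```python
from math import sqrt
from statistics import fmean, stdev

def one_sample_t(values, mu=0.0):
    """Return (t, df) for a one-sample t-test of the mean against mu."""
    n = len(values)
    t = (fmean(values) - mu) / (stdev(values) / sqrt(n))
    return t, n - 1

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from F: F*df1 / (F*df1 + df2)."""
    return f_value * df_effect / (f_value * df_effect + df_error)

# Synthetic amplitudes (uV), invented purely for illustration.
t_value, df = one_sample_t([1.0, 2.0, 3.0, 4.0, 5.0])
adjusted = bonferroni([0.01, 0.04, 0.5])
# Group effect reported for the 200-300 ms window: F(1, 36) = 3.131.
eta_p2 = partial_eta_squared(3.131, 1, 36)
print(round(t_value, 3), df, adjusted, round(eta_p2, 3))
```

Applied to the reported Group effect F(1, 36) = 3.131, the last function gives a value that rounds to the reported η² = 0.080.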

Multi-feature MMN paradigm
Positive MMRs, mainly to the emotional stimuli but also to some of the phonetic deviants, were found at all four electrodes in both infant groups (Fig. 2). Detailed results of the one-sample t-tests (t-values, p-values, mean amplitudes, standard deviations, and 95% confidence intervals) are reported in Supplementary Table 2. In the early latency window, 200-300 ms, the repeated-measures ANOVA showed that the mean amplitudes of the preterm group infants were larger than those of the full-term group infants (preterm group 2.591 μV, full-term group 1.136 μV); however, the group difference was not statistically significant [F(1, 36) = 3.131, p = .085, η² = 0.080]. Furthermore, a main effect of Variant was discovered [F(8, 29) = 5.796, p = .005, η² = 0.139], due to the emotional variant happy eliciting the largest mean amplitudes (happy 3.074 μV, sad 1.362 μV, angry 1.154 μV), and statistically differing from the emotional variant angry (p = .005).
In the latency window 400-500 ms, ANOVA revealed a main effect of Variant [F(8, 29) = 8.301, p < .001, η² = 0.187], resulting from the responses to the emotional sounds happy and angry being statistically larger than those to the phonetic deviants in almost all of the comparisons (see Supplementary Table 3). Moreover, there was an interaction effect between Electrode and Variant [F(24, 13) = 3.007, p = .001, η² = 0.077], as a result of the largest amplitudes being elicited in the frontal electrode line (F3 and F4, see Supplementary Table 2). In the later latency window, 550-650 ms, a main effect of Variant [F(8, 29) = 7.282, p < .001, η² = 0.168] was significant in the ANOVA, due to the responses to the emotional variants happy and especially angry being statistically larger than those to the phonetic deviants (see Supplementary Table 3). Furthermore, there was an interaction effect between Electrode and Variant [F(24, 13) = 2.177, p = .018, η² = 0.057], indicating that the largest amplitudes for the sounds were elicited in the frontal line electrodes F3 and F4 (see Supplementary Table 2).
The Pearson correlation test showed that, in the early latency window, 200-300 ms, the gestational age at birth was negatively correlated with the MMR amplitude of the emotional stimulus happy at only one electrode, F4 (r = −0.475, p = .034). In the latency window 400-500 ms, the gestational age at birth in the preterm group infants did not correlate with the responses to any of the deviant or emotional stimuli. In the later latency window, 550-650 ms, the gestational age at birth correlated negatively with the response to the phonetic deviant intensity change (+6 dB) at the central electrode C3 (r = −0.506, p = .023). The preterm birth weight did not correlate with any of the responses in any of the latency windows.
Table 3. The accepted epochs of each stimulus in both paradigms. a) The accepted epochs for the standard, deviants, and emotional sounds in the multi-feature MMN paradigm (two stimulation blocks combined).

Oddball paradigm
In the oddball paradigm, positive MMRs for the deviant change were found in the preterm group at the electrodes F3, F4, and C3 in all latency windows 200-300 ms, 400-500 ms, and 550-650 ms (Fig. 3). In contrast, MMRs to the deviant change in the full-term group were found only in the last time window 550-650 ms at the electrodes F3 and C3. The detailed results of the one-sample t-test, including t-values, p-values, mean amplitudes, standard deviations, and 95% confidence intervals can be found in Supplementary Table 2. In the early latency window, 200-300 ms, the ANOVA revealed an interaction effect between Group and Electrode [F(3, 34) = 3.086, p = .045, η² = 0.079], resulting from the mean MMR amplitudes in the preterm group infants being statistically larger than those of the full-term group infants at the left hemisphere electrodes F3 (p = .022; preterm group 2.525 μV, full-term group 0.512 μV) and C3 (p = .039; preterm group 2.196 μV, full-term group 0.056 μV). In the second time window 400-500 ms, the repeated-measures ANOVA showed a main effect of Group [F(3, 34) = 4.244, p = .047, η² = 0.105], due to the preterm group infants' responses being larger than those of the full-term group infants (preterm group 2.413 μV, full-term group 0.468 μV).
The results of the Pearson correlation test showed only one positive correlation, between the birth weight and the MMR amplitude to the deviant change at the electrode F4 (r = 0.491, p = .028). No other correlations between the MMR amplitudes and the gestational age at birth or the birth weight in the preterm group infants were found in any of the latency windows.

Discussion
In the present study, we examined the automatic change-detection process and the processing of emotional speech sounds in preterm infants in comparison to full-term infants at term age. Neither the gestational age at birth nor the birth weight of the preterm infants was connected to the magnitude of the responses at term age. The results demonstrated that both preterm and full-term infants had prominent positive MMRs to the emotional sounds in the challenging multi-feature MMN paradigm. For the phonetic deviants, both infant groups showed few positive MMRs in the second latency window. However, in the later latency window, 550-650 ms, preterm infants elicited more positive MMRs to the phonetic deviants, unlike the full-term infants. Nevertheless, no group differences in the neural processing of speech sounds at term age were found in any of the latency windows. Large individual variations within the groups, as well as qualitative group-level differences in the data, may reflect some underlying differences that call for further studies.
The emotional stimuli in the multi-feature MMN paradigm elicited clear positive MMRs that peaked approximately 250 ms after change onset, in both preterm and full-term groups. For the emotional stimuli, these positive amplitudes were followed by later positive deflections that peaked approximately 450 ms and 600 ms after change onset in both preterm and full-term groups. For the phonetic deviants, positive MMRs were elicited approximately 350 ms after change onset, and they were more prominent in the preterm infant group. The emotional stimuli elicited the largest amplitudes in both infant groups, and the most robust brain responses were elicited by the emotional stimuli happy and angry. In this paradigm, the emotional stimuli evoked larger responses for two possible reasons. First, the acoustic features in the emotional sounds such as intonation, stress, and intensity most likely make those sounds more distinguishable in the sound stream. Thus, the acoustical differences between emotional and standard stimuli are larger than the differences between deviant and standard stimuli, which makes the detection of the changes easier. Second, emotional stimuli appear less frequently in the sound stream than the standard and the deviants, making those sounds more unexpected for the auditory system and, therefore, evoking larger MMRs.
Larger amplitudes for the emotional sounds could also be partially explained by the composition of the acoustic features and infants' responsiveness to affective prosody. Affective prosody is an important factor of language learning, as it has been pointed out that newborns are sensitive to prosodic cues in speech (Sambeth et al., 2008), and that statistical structuring of speech is enhanced when emotional speech features are exaggerated (Bosseler et al., 2016). Furthermore, our current findings support the results of our previous study (Kostilainen et al., 2018), where we suggested that newborn infants' brains respond pre-attentively and automatically to emotionally uttered speech. This response might be caused by the acoustical differences that differentiate emotional or infant-directed speech from continuous, adult-directed speech.
Fig. 2. The waveforms for the standard stimulus, and the standard-subtracted waveforms for the emotional and deviant stimuli in the multi-feature MMN paradigm at the electrode F4 (y-axis is reversed with negative at the top). The thick line represents the mean amplitude of the preterm group and the dotted line represents the mean amplitude of the full-term group. The examined latency windows are highlighted in grey, and statistically significant responses are marked as * p < .05, ** p < .01, *** p < .001. No group differences were found in the multi-feature MMN paradigm.
Fig. 3. The waveforms for the standard stimulus, and the standard-subtracted waveforms for the deviant stimulus in the oddball paradigm at the electrode F4 (y-axis is reversed with negative at the top). The thick line represents the mean amplitude of the preterm group and the dotted line represents the mean amplitude of the full-term group. The examined latency windows are highlighted in grey, and statistically significant responses are marked as ** p < .01, *** p < .001. In the oddball paradigm, a group difference was found in the second latency window 400-500 ms (and in the first latency window 200-300 ms at the electrodes F3 and C3).
As the prosodic features in speech guide infant attentiveness towards speech perception, infants might be specifically predisposed to process these acoustic features, resulting in larger MMRs. Hence, it should be considered whether the prominent positive peak at 250 ms after stimulus onset to the rare emotional sounds in the multi-feature paradigm could be a P3a type of response and reflect involuntary orienting. The P3a response, which is a positive peak elicited in the frontal and central areas at 200-400 ms after stimulus onset to a novel sound (Squires et al., 1975), can exist already in early stages of development, as studies have shown P3a-like responses to novel sounds in newborn infants (Kushnerenko et al., 2013; Háden et al., 2009). Nevertheless, in order to better understand what specific features in the emotional stimuli lead to larger MMR amplitudes, the stimulus characteristics need to be more carefully controlled.
In the simple oddball paradigm, the preterm infants had positive MMRs to the deviant change in all three latency windows, whereas the full-term infants had only two significant responses in the last latency window. There was a statistically significant interaction effect of Group and Electrode in the early latency window 200-300 ms, due to significantly larger brain responses in the preterm infants at the left hemisphere electrodes (F3, C3). In addition to this, a main effect of Group was found in the second latency window 400-500 ms, resulting in statistically larger amplitudes in preterm infants when compared to those of full-term infants. According to these results, an actual difference in the neural processing of non-speech sounds between preterm and full-term infants may already be observable at term age.
In contrast to Fellman et al.'s (2004) findings, our results suggest that the neural processing of non-speech sounds may be enhanced in preterm infants in comparison to full-term infants at term age. Our findings were in line with our previous study (Kostilainen et al., 2018), where we studied healthy newborns' auditory and emotion processing with the same paradigm and similarly found significant responses to only a few emotional sounds. It seems that this kind of paradigm, consisting of multiple deviant changes with rare sounds and a rapid presentation rate, might be too challenging for the infants to detect the smaller phonetic changes, leading to the detection of only the most distinct emotional sounds. For full-term infants, however, the paradigm alone is not the only reason for the absence of MMRs, considering that full-term infants also lacked significant responses in the simple one-deviant oddball paradigm.
Taking into consideration that MMRs in full-term infants have been reported in previous studies using both oddball (Morr et al., 2002; Leppänen et al., 2004; Cheng et al., 2015) and multi-feature (Partanen et al., 2013) paradigms, the individual variations between participants might explain the lack of MMR in the full-term group. However, there are also studies reporting an absence of MMR in newborns (Ceponiené et al., 2002; Cheour et al., 2002b). These inconsistencies between infant studies highlight the incidence of large variation in newborn data, which can easily decrease the number of significant responses at the group level (Sambeth et al., 2006; Cheour et al., 1998, 2002c; Kostilainen et al., 2018).
While the preterm infants in the current study elicited more positive MMRs than the full-term infants, the reason for this is not entirely clear. There have been proposals that, during early development, a negative MMR component appears (see, e.g., He et al., 2007). Thus, it could be that the more positive MMRs in the preterm group would be indicative of a more immature neural processing while in the full-term group, a small negative MMR, coinciding with the positive MMR, reduced the overall response magnitude. Alternatively, it may be that due to exposure to the extrauterine environment, the preterm group infants are more attentive towards sounds, according to the model proposed by Kushnerenko et al. (2013). However, currently, the reason for this difference remains unclear. Even in full-term infants, responses of opposite polarity have been reported in studies using similar contrasts (see, e.g., vowel contrast in Finnish newborns with Cheour et al. (1998) reporting negative MMRs and Partanen et al. (2013) reporting positive MMRs).
In conclusion, the neural processing of speech-sound changes did not statistically differ between preterm and full-term infants at term age. In the oddball paradigm, however, our results suggest that the differences in the early auditory environment might have affected the maturation of the auditory system in preterm infants already before term age since group differences were found for non-speech sounds. Further studies examining the early auditory development of preterm infants are needed, as well as studies investigating the effects of the early auditory environment in neonatal care on the maturation of the auditory system and discrimination skills in infants after preterm birth. Moreover, a long-term follow-up of the early brain responses in comparison to other developmental indexes, such as linguistic skills, should be evaluated to better understand the connection between early ERPs and cognitive development in infancy after preterm birth.

Declaration of competing interest
None of the authors have potential conflicts of interest to be disclosed.