Relationship between speech production and perception in children with Speech Sound Disorders

Larissa Cristina Berti¹, Jhulya Guilherme², Cássio Esperandino² and Aline Mara de Oliveira³
¹ Department of Speech, Language and Hearing Sciences, São Paulo State University, Universidade Estadual Paulista, BR
² Programa de Pós-Graduação em Fonoaudiologia, São Paulo State University, Universidade Estadual Paulista, BR
³ Department of Speech, Language and Hearing Sciences, Santa Catarina Federal University, Universidade Federal de Santa Catarina, BR


Introduction
During the process of phonetic-phonological acquisition, researchers highlight the role played by children's articulatory and auditory skills, as well as the sensory-motor connections that underlie this process (Munson, Edwards & Beckman, 2005; Howard, 2010; Panneton & Newman, 2011). Therefore, language contrasts are perceived and produced whenever learners master perceptual and articulatory skills.
Every language has specific sets of phonological contrasts that provide informative aspects of the linguistic system. There is a specific standardization of which and how many segments may occur within a syllable. In the case of Brazilian Portuguese (BP), syllables have the maximal structure C1C2VVC3C4. At least one vowel must occur in a BP syllable. BP has a vocalic inventory of seven vowels (/i, e, ɛ, a, ɔ, o, u/). Five of these vowels (/i, e, a, o, u/) can be nasalized phonetically or phonologically. When two vowels co-occur, one will be a glide (/y/ or /w/), which may precede or follow the other vowel. In the case of two co-occurring prevocalic consonants, C2 will necessarily [...]
[...] and Bishop (1992) claimed that children with SSD showed some ability to discriminate contrasts they could not produce. Similarly, Nagao et al. (2012) examined the relationship between speech perception and speech production in children with SSD. They observed considerable variation in perceptual performance, such that children with SSD can be classified into different subgroups based on speech production and speech perception measures. A specific group of those children might have poor perceptual abilities to identify phonemic sounds, even though they do not make articulation errors.
In a recent study (Hearnshaw, Baker & Munro, 2018), the authors examined the differences in perceptual performance between children with and without SSD in a more sophisticated way: in terms of the relationship between overall speech perception accuracy and speech production abilities using the same target words, and in terms of the relationship between overall perceptual accuracy and the proportion of correctly perceived target words that were also produced correctly. In summary, the authors reported that children with SSD perceived speech less accurately than their typically developing peers, despite the large variability within the group of children with SSD. They also found a positive correlation between the overall speech perception and speech production scores. Still, there was no significant relationship between the children's abilities to produce and to perceive the four targeted phonemes accurately; that is, there was no univocal relationship between speech errors and perception errors.
Given the contradictory results obtained so far, it seems necessary to extend the investigation of the perceptual performance of children with SSD in order to understand the relationship between speech production and speech perception in this group. The present study will extend the literature in two ways. First, we will describe the perceptual and the production performance of a group of children with SSD (n = 33). Secondly, we will consider the 19 consonantal phonemes of BP that may occur in C1 position in the analysis of both speech production and speech perception. It is important to highlight that the findings reported above came from studies involving English-speaking children. In addition, those studies investigated a very restricted set of English contrasts. They often assess perceptual ability using standardized tests with target phonemes that are not related to the child's articulation errors. In general, standardized speech perception tests include too few trials to guarantee a reliable assessment of the child's ability. In this study, in contrast, we will explore all consonantal phonemic contrasts of BP within the phonemic classes: stops, fricatives, and sonorants (nasals and liquids). To date, there are no data correlating speech perception and speech production for BP.
The original purpose of this study is to investigate the relationship between speech production and speech perception in children with SSD who speak Brazilian Portuguese. Specifically, we will compare and correlate the overall speech production and speech perception performances of children with SSD, and investigate the correlation between speech production and speech perception errors according to the phonemic class.
Assuming that children's perception of speech sounds is a critical variable influencing the way these sounds are produced, the hypotheses are: H1: a positive correlation between speech production and speech perception data for children with SSD; H2: a difference between production and perception performance scores in children with SSD; H3: a positive correlation between speech production and speech perception errors that could depend on the phonological class.
[...] Paulo - Brazil). Both the children and their parents and/or legal guardians agreed to participate and signed the Informed Consent Form (ICF). This study followed the guidelines and regulatory standards for research involving human beings established by National Health Council resolutions 466/12 and 510/16.

Participants
Thirty-five children diagnosed with SSD (mean age = 68 months, SD = 12.25), assessed by a Speech-Language Pathologist, were recruited at the Speech-Language Therapy Clinic of UNESP (Marília, Brazil).
The inclusion criteria were a diagnosis of SSD without comorbidities, such as language impairment or anatomical and morphological alterations that impair the speech production process (e.g., cleft lip and palate). The exclusion criterion was the presence of otological/hearing disorders. All children were monolingual speakers of Brazilian Portuguese.
A speech-language pathologist carried out speech-language and hearing screenings to identify possible alterations in spoken language, voice, orofacial motricity, and otological/hearing alterations. For the speech-language testing, specific protocols were used, while for the hearing screening, the Interacoustic AD-28 audiometer was used with TDH-39 headphones in an acoustic booth. The frequencies of 1000, 2000, and 4000 Hz were investigated at an intensity of 20 dB HL (decibel hearing level).
Among the 35 children recruited, two presented alterations in the hearing screening; they were referred for specific services at the Speech-Language Therapy Clinic of UNESP (Marília) and were excluded from this study. Chart 1 shows the characterization of the participants.

Speech Production Task
Each child performed a picture naming task in which the IAFAC instrument (Berti, Lacava & Pagliuso, 2009) was used to assess speech production. The IAFAC consists of 96 words represented by corresponding pictures. This instrument makes the analysis of the phonological and phonetic system possible, since it includes all 19 consonant phonemes of Brazilian Portuguese in simple syllabic onset (six stops: /p, b, t, d, k, g/; six fricatives: /f, v, s, z, ʃ, ʒ/; three nasals: /m, n, ɲ/; two laterals: /l, ʎ/; two non-laterals: /ɾ, R/). Through a recreational activity, the children were shown the pictures and asked to say the target word aloud. The naming task was recorded with a digital recorder (Marantz, type PMD 670) attached to a cardioid dynamic vocal microphone (SHURE, type 8800).
A trained research assistant trimmed the recordings using PRAAT (Boersma & Weenink, 2019) to isolate each word produced by each child and to create the stimuli for the perceptual experiments. The cuts were made by identifying a portion before and after the onset of the speech waveform.
The production data were transcribed and judged independently by four Speech-Language Pathologists (SLPs). A minimum agreement of 75% among the judgments of each production segment was required for the final transcription.
children with Speech Sound Disorders Art. 13, page 5 of 13
Chart 1: Characterization of the participants (n = 33).
The PCC-R value (Percentage of Consonants Correct-Revised; Shriberg, Austin, Lewis, McSweeny, & Wilson, 1997) was then calculated for each child, corresponding to the performance score of their speech production.
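As an illustration only, the agreement rule and the PCC-R score can be sketched as follows. The label names (`correct`, `distortion`, `substitution`, `omission`), the function names, and the decision to drop consonants that do not reach 75% agreement are assumptions made for this sketch; the source does not state how such cases were handled. The only property taken from the PCC-R definition is that distortions are scored as correct, so that only substitutions and omissions count as errors.

```python
from collections import Counter

def consensus(judgments, min_agreement=0.75):
    """Return the majority label if it reaches the agreement threshold, else None."""
    label, count = Counter(judgments).most_common(1)[0]
    return label if count / len(judgments) >= min_agreement else None

def pcc_r(consensus_labels):
    """PCC-R: percentage of scored consonants judged correct.

    Distortions count as correct; substitutions and omissions are errors.
    Consonants without consensus (None) are skipped (assumption of this sketch).
    """
    scored = [lab for lab in consensus_labels if lab is not None]
    if not scored:
        raise ValueError("no consonant reached the agreement threshold")
    correct = sum(lab in {"correct", "distortion"} for lab in scored)
    return 100.0 * correct / len(scored)

# Hypothetical judgments from four judges for four target consonants.
judgments_per_consonant = [
    ["correct", "correct", "correct", "correct"],                   # 4/4 agree
    ["substitution", "substitution", "substitution", "correct"],    # 3/4 agree
    ["distortion", "distortion", "distortion", "distortion"],       # 4/4 agree
    ["omission", "omission", "correct", "correct"],                 # 2/4, no consensus
]
labels = [consensus(j) for j in judgments_per_consonant]
score = pcc_r(labels)
print(labels, round(score, 2))
```

With four judges, the 75% threshold means that at least three of them must give the same judgment for a segment.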

Speech perception task
Auditory perception, focusing on the identification of phonic contrasts in BP, was assessed using the PERCEFAL instrument (Berti, 2017a). The PERCEFAL instrument comprises a set of four experiments that separately evaluate the identification of contrasts among vocalic phonemes and among stop, fricative, and sonorant consonant phonemes. For this study, we used the tests involving consonantal contrasts: stops, fricatives, and sonorants.
The perception experiments were composed of an identification task (also known as a forced-choice minimal-pair identification task) involving the phonemic contrasts in separate consonantal classes.
The perceptual experiments included three stages: word recognition, training, and testing, with an approximate 15-minute overall duration for each experiment. All steps were run by the PERCEVAL software (Perception Evaluation Auditive & Visuelle) (André, Ghio, Cavéc, & Teston, 2009), in which the presentation times of the stimuli were controlled.
The word recognition stage consisted of presenting the visual and auditory input (cues) to the children to verify whether they knew the words and/or pictures used in the experiments. Children who reached a threshold of 80% correct answers proceeded to the training and testing stages. The training stage was run automatically by the software and aimed to help the participants understand and become familiar with the task. This step consisted of a perceptual identification task whose results were not computed. The stimuli were randomized, and 10 presentations were selected. Afterwards, the testing stage began.
For the testing stage, the children were comfortably seated inside an acoustic booth in front of a computer screen (with the PERCEVAL software installed) and wore KOSS headphones. The acoustic stimuli were presented binaurally to each child, who had to choose between two pictures displayed on the computer screen according to each stimulus heard.
The results of the perception experiments were analyzed and the Percentage of Correct Identification (PCI) was calculated, that is, the percentage of correct answers in the identification task involving the phonic contrasts among the BP consonant phonemes.
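The PCI score described above is a simple proportion, and the per-class error percentage used in the later analyses is its complement. The sketch below shows one way to compute both. The trial format (class, target, chosen response), the function names, and the example trials are hypothetical and are not taken from the PERCEFAL instrument or the PERCEVAL software.

```python
def pci(responses):
    """Percentage of Correct Identification for (target, chosen) pairs."""
    if not responses:
        raise ValueError("no responses")
    hits = sum(target == chosen for target, chosen in responses)
    return 100.0 * hits / len(responses)

def pci_by_class(trials):
    """Group (phonemic_class, target, chosen) trials and score each class."""
    grouped = {}
    for cls, target, chosen in trials:
        grouped.setdefault(cls, []).append((target, chosen))
    return {cls: pci(pairs) for cls, pairs in grouped.items()}

# Hypothetical trials: the child hears `target` and picks `chosen`.
trials = [
    ("stops", "p", "p"),
    ("stops", "b", "p"),
    ("fricatives", "s", "s"),
    ("fricatives", "z", "s"),
    ("fricatives", "f", "f"),
    ("fricatives", "v", "v"),
    ("sonorants", "m", "m"),
    ("sonorants", "l", "l"),
]
overall = pci([(t, c) for _, t, c in trials])
per_class = pci_by_class(trials)
# Percentage of perception errors per class is the complement of the PCI.
errors = {cls: 100.0 - value for cls, value in per_class.items()}
print(overall, per_class, errors)
```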

Statistical Analysis
The STATISTICA software (version 7.0) was used for the analysis. The PCC-R (production performance) and PCI (perceptual performance) scores were compared through a paired t-test and correlated through Pearson's correlation test. Correlation is a measure of the relation between two or more variables. Correlation coefficients range from -1.00 to +1.00: a value of -1.00 represents a perfect negative correlation, a value of +1.00 represents a perfect positive correlation, and a value of 0.00 represents a lack of correlation. The PCC-R and PCI values were considered dependent variables in these tests. P values lower than .05 were deemed significant. For the multiple correlations, a Bonferroni correction was applied to control for Type I error. Table 1 shows the descriptive and inferential statistical results for the speech production performance (measured by PCC-R), as well as the perceptual performance (measured by PCI, the percentage of correct identification).
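The two tests named above can be illustrated with a short standard-library sketch. This is not the STATISTICA procedure used in the study. The function names and the four pairs of scores are invented for illustration, and the sketch returns only the t statistic and its degrees of freedom; a p-value would come from a t distribution table or a statistics package.

```python
import math
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(x, y):
    """Paired-samples t statistic for x - y and its degrees of freedom."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical scores for four children (not data from the study).
pcc_r_scores = [60.0, 70.0, 80.0, 90.0]   # production
pci_scores = [80.0, 90.0, 85.0, 95.0]     # perception

r = pearson_r(pcc_r_scores, pci_scores)
t, df = paired_t(pci_scores, pcc_r_scores)
print(round(r, 2), round(t, 3), df)
```

Both tests operate on the same pairs of scores: the correlation asks whether children who score higher on one measure also score higher on the other, while the paired t-test asks whether the two means differ.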

Results
The mean PCC-R of the children with SSD was 74.97% (SD = 18.15), while the mean PCI (percentage of correct identification) was 87.41% (SD = 9.51). Pearson's correlation test was statistically significant, showing a moderate positive correlation (r = 0.49) between the performances (see Figure 1).
When comparing the production and perception performances, the paired-sample t-test showed that the mean speech perception performance (M = 87.41%, SD = 9.51) was significantly higher than the mean speech production performance (M = 74.97%, SD = 18.15). Figure 2 illustrates this difference. Table 2 shows the statistical results for the percentage of errors in speech production and speech perception according to phonological class.
Correlation coefficients were computed between the percentages of errors in speech production and speech perception for the three phonological classes. Using the Bonferroni approach to control for Type I error across the three correlations, a p-value of less than 0.0167 (0.05/3 ≈ 0.0167) was required for significance. The correlation analyses show that the correlation was statistically significant only in the fricative class, with moderate strength (r = 0.52), as illustrated in Figure 3.
In summary, there is a moderate positive correlation (r = 0.49) between speech production and speech perception performances. The mean speech perception performance was significantly higher than the mean speech production performance. However, in the correlation analyses of the errors, only the fricative class shows a statistically significant correlation.
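The Bonferroni adjustment amounts to dividing the nominal alpha by the number of tests. A minimal sketch follows; the function names and the three p-values are placeholders invented for illustration and are not results from this study.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

def significant(p_values, alpha=0.05):
    """Flag each named p-value against the Bonferroni-adjusted threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return {name: p < threshold for name, p in p_values.items()}

threshold = bonferroni_alpha(0.05, 3)

# Placeholder p-values (hypothetical, not taken from the study).
example = significant({"test_1": 0.20, "test_2": 0.002, "test_3": 0.04})
print(round(threshold, 4), example)
```

In this sketch a p-value of 0.04 would pass the uncorrected .05 criterion but not the corrected one, which is the purpose of the adjustment.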

Discussion
The purpose of the present study was to investigate the relationship between speech production and perception in children with SSD. Considering the conflicting evidence in the literature regarding the relationship between speech production and speech perception in children with SSD, three hypotheses were formulated, assuming that children's perception of speech sounds is a critical variable influencing the way these sounds are produced: H1: a positive correlation between the production and perception data for children with SSD; H2: a difference between overall accuracy of speech production and speech perception in the children with SSD; and H3: a positive correlation between speech production and speech perception errors that could depend on the phonological class. The first hypothesis was confirmed, as Pearson's correlation test showed a moderate positive correlation (r = 0.49) between speech production and speech perception performances. This result corroborates previous studies that point to a relationship between perception and production (Rvachew & Jamieson, 1989; Munson et al., 2005; Nijland, 2009; Cabbage, Hogan & Carrell, 2016; Hearnshaw et al., 2018).
The researchers cited above explain the positive correlation between speech production and speech perception in terms of the underlying representation of the contrasts. That is, if a child has not established the underlying representation for a given phonological contrast, this will affect both speech production and speech perception, because these skills require access to a symbolic system. Therefore, a representation deficit, which is the case in children with SSD, causes or contributes to speech production and speech perception deficits.
Additionally, we can consider the argument of Hearnshaw et al. (2018) for the significant positive correlation between overall speech production and perception accuracy found in their study. According to them, because children with SSD have poorer perceptual accuracy than children with typically developing speech (i.e., adequate speech production skills), children with SSD are more likely to have poorer perceptual representations.
The second hypothesis, regarding the difference between speech production and speech perception performances in children with SSD, was also confirmed. The results showed higher perceptual accuracy than production accuracy.
Three reasons may explain these findings. The first is related to the motor performance of children with SSD. Research on motor speech performance in children with SSD has reported abnormal movement patterns, interpreted as suggesting motor differences (Gibbon, 1999; Gibbon & Wood, 2002; Goozée et al., 2007; Berti, de Boer & Bressmann, 2016). Although we did not investigate the children's motor performance in this study, they may present some degree of motor difficulty, which would explain their poorer speech production compared to their perceptual performance.
The second reason is related to development. As might be expected, speech perception precedes speech production. In general, in the language acquisition process, children are first able to discriminate and identify a phonological contrast perceptually and only afterwards to produce it. Rvachew & Jamieson (1989), for instance, highlighted the fact that children with speech perception difficulties were found not to have normal production skills, suggesting that perception precedes production.
The third possible reason is related to speech perception phenomena. Speech perception is an auditory-visual event, since it involves the integration of auditory and visual cues into a unitary phonological entity (Dodd, McIntosh, Erdener, & Burnham, 2008). Consequently, considering that the perceptual dimension encompasses other aspects (such as visual cues and/or semantic information), the children with SSD who participated in this study may have used additional information to aid them in the phonic contrast identification task (Burnham, Tyler & Horlyck, 2002).
We also need to consider, according to Table 1, the performance variability in speech production among children with SSD, based on the standard deviation values we found. The standard deviation for speech production was higher than the standard deviation for speech perception. This finding could reflect the heterogeneity of children with SSD. Some studies (Dodd et al., 2008; Nagao et al., 2012; Hearnshaw et al., 2018) also highlight the importance of considering the SSD subtypes in speech production and speech perception tasks.
Finally, the third hypothesis, regarding the possibility that a positive correlation between speech production and speech perception errors depends on phonic class, was also confirmed. In the correlation analysis between the speech production and speech perception errors, the correlation was statistically significant only in the fricative class, with moderate strength (r = 0.52).
Regarding the presence of a positive correlation between speech production and speech perception errors in children with SSD, Edwards (1974) claimed that the relationship between speech production errors and speech perception ability may not exist for all phonemic contrasts, because the role of auditory perception in the development of articulation skills may vary depending on the particular phoneme being learned.
The fricative class in BP consists of six phonemes (/f, v, s, z, ʃ, ʒ/), which could pose more difficulty for children during the acquisition process. A previous study of BP reported that the fricative class showed the highest incidence of acquisition problems for children with SSD (Patha & Takiuchi, 2008). Berti (2017b) verified that children's auditory perceptual accuracy was dependent on the phonemic class, with lower perceptual accuracy for the fricative class. This author stresses the important interaction between acoustic features and anatomical-physiological features of the human ear when explaining these results.
From an acoustic perspective, fricatives are usually characterized by aperiodic energy distributed across the frequency spectrum according to the length of the front cavity formed during production. More specifically, the shorter the front cavity, the higher the resonance frequencies (Kent & Read, 1992). In terms of the sensitivity of the human ear, the neural signal is not in a one-to-one relationship with the loudness of frequencies above 5,000 Hz, and these frequencies are less salient than lower frequencies in the presence of background noise (Johnson, 1997).
In addition to these possible explanations, children with SSD, being of school age, could present recurrent otitis media. Because recurrent otitis media is associated with fluctuations in hearing sensitivity during the preschool years, it might interfere with learning to identify the acoustic cues that are critical for perceiving fricative contrasts. Another potential factor is environmental noise, because high noise levels at home and at school negatively affect children's perceptual performance.
This study has some limitations. The group of participants was heterogeneous, presenting a large diversity of speech production errors and different degrees of phonological disorder severity. Future studies need to take these aspects into account.

Conclusion
The results showed that the mean speech perception performance was significantly higher than the mean speech production performance, that there is a positive correlation between speech production and speech perception performances in children with SSD, and that the positive correlation between speech production and speech perception errors depends on the phonological class. Only the fricative class showed a significant, moderate relationship between production and perception errors.
It is important to highlight the need to deepen the investigation of the relationship between speech production and speech perception in other languages, since most studies are conducted in majority English-speaking contexts. Therefore, the findings of such studies may not transfer to non-majority English-speaking contexts. The results confirmed the relationship between speech production and speech perception. Speech perception seems to be a critical variable influencing the way children with SSD produce these sounds. However, speech perception does not mirror speech production; that is, there is no univocal relationship between speech production and speech perception.

Funding Information
This research was funded by FAPESP - Fundação de Amparo à Pesquisa do Estado de São Paulo (grant number 2016/08775-0) and by CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico (grant numbers 303439/2016-5 and 429025/2018-1). This study was also financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.