Auditory neural encoding of speech in adults with persistent developmental stuttering

Ola A. Ibraheem, Amal S. Quriba

Background
Stuttering is a speech disorder with frequent and protracted prolongations, repetitions, and silent blocks that hamper proper speech production. It develops during the preschool years with a prevalence of 5%, decreasing to 1% in adulthood, when it is referred to as persistent developmental stuttering. An auditory processing deficit is proposed to be one of the factors contributing to developmental stuttering.

Objective
This study aimed to determine the pattern of auditory processing impairment, if any, in stuttering. This might help improve management approaches in the future.

Patients and methods
Eleven adults with persistent developmental stuttering and 11 comparative age-matched normally fluent participants were examined with auditory brainstem response (ABR) and mismatch negativity to evaluate the brainstem and cortical processing of speech syllables, respectively.

Results
All participants exhibited normal brainstem processing of nonspeech (click) stimuli, whereas 72.7% of stutterers revealed prolongation of peak latency of all waves of speech-evoked ABR. An additional peak latency delay of mismatch negativity response was found in 81.8% of stutterers.

Introduction
Stuttering is a speech disorder with abnormal frequency or duration of interruptions in verbal fluency, manifesting as repetitions, prolongations, hard attack, or blocks that interfere with efficient speech production [1]. Intraphonemic disruptions (IPDs) are thought to be a notable feature of stuttering [2].
The pathogenesis of stuttering has not been well established, although two models have been proposed: neurogenic and developmental. Neurogenic or acquired stuttering occurs after definable brain damage (e.g. stroke, intracerebral hemorrhage, or head trauma) [3]. The most common form of stuttering is developmental stuttering. It usually develops in about 5% of children between 2 and 6 years of age. About 20% of those children continue to stutter during adulthood and the problem is then referred to as persistent developmental stuttering (PDS) [4]. Many factors underlie developmental stuttering, mainly defective auditory processing [5].
Although stuttering manifests as an articulatory deficit, speech perception is the modality that serves to control proper articulation. Thus, defective auditory processing and feedback may play an important role in stuttering. Data explaining the role of brainstem and cortical auditory processing of speech sounds in developmental stuttering are deficient. Processing of acoustic stimuli at different levels of the auditory neural pathway can be examined in normal and clinical populations using an objective, noninvasive, and reliable tool, namely, auditory-evoked potentials. The auditory brainstem response (ABR) and mismatch negativity (MMN) are auditory-evoked potentials that are widely used as convenient measures of brainstem and cortical auditory functions, respectively [6].
The ABR represents synchronized neural response to brief acoustic signals from a large number of neurons through the auditory nerve and brainstem [7]. ABRs can be evoked using click, as well as more complex signals such as speech sounds, which are relatively more informative with respect to psychological and linguistic aspects [8]. Accordingly, speech-evoked auditory brainstem response (sABR) recordings may have more diagnostic and prognostic implications to help identify patients with speech and language disorders. Speech syllables with brief duration, such as da, ba, and ga, are usually used to elicit ABR because of their time-varying property, in particular stop consonants, rendering them perceptually vulnerable in clinical populations such as those with learning disability [9], autism [10], and attention deficit hyperactivity disorder [11].
MMN is a cortical response generated bilaterally in the supratemporal part of the auditory cortex and in the inferior frontal cortices [12]. It is commonly elicited in an odd-ball paradigm in which a standard stimulus is paired with a deviant stimulus, the standard being presented in the majority of instances. It represents preattentive detection of a change from the active sensory memory trace of the standard stimulation [13]. Consequently, MMN can be used as a useful tool for investigating the different aspects of cognition and cortical auditory processing.
Self-generated speech sounds generate auditory feedback that integrates upcoming motor commands, which is important for the stability and control of speech production. Alteration of the timing of auditory feedback, specifically delayed auditory feedback (DAF), in individuals with fluent speech induces a variety of articulation disturbances [14]. Conversely, DAF has been shown to enhance speech fluency in some stutterers [15]. The degree of fluency enhancement varies depending on a number of variables including duration of delay, feedback intensity, the context, and the individual [16]. Consequently, the clinical effectiveness of altered auditory feedback as a treatment tool remains controversial [17]. Determining the underlying defect in stuttering speech abnormality may help improve management strategies. Accordingly, this study was designed to investigate the auditory perception in adults with PDS.

Patients
Participants in this study were divided into two groups: (i) the study group included 11 men (20.2 ± 1.87 years) with persistent developmental stuttering (PDS) and (ii) the control group consisted of 11 age-matched (20.9 ± 2.28 years; P = 0.463), sex-matched, and education-matched normally fluent speakers with no history of speech and language impairments. Dysfluent and normally fluent participants were recruited from Phoniatrics and Audiology Units, ENT Department, Faculty of Medicine, Zagazig University, between August 2012 and February 2013.
The choice of the adulthood period was based on the observation that the disorder became persistent in this developmental stage. All participants had an average hearing threshold not exceeding 20 dB HL in the frequency range of 250-8000 Hz, normal middle ear function, no neurological disorders, and average intelligence quotient (IQ). The mean IQ ± SD of the control group was 90 ± 11 and that of the study group was 91 ± 9, based on psychometric evaluation. None of the participants of the study group suffered from other speech, language, or voice disorders. All participants gave their written informed consent before participation in the study. The institutional review board approval for this work was obtained on 15 May 2012.

Phoniatric assessment
A multidimensional assessment protocol was used for assessment of patients. It included the following levels.

Elementary diagnostic procedures
These included a personal interview and history taking as well as auditory perceptual assessment of speech encompassing subjective evaluation for the presence of repetitions, prolongation, blocks, and IPDs. Visual perceptual assessment was carried out for evaluation of eye contact and involuntary movement.

Clinical diagnostic aids
These included speech recording and formal tests for measuring stuttering severity using the Stuttering Severity Instrument-3 (SSI-3). The SSI-3 assessment was based on: (i) the percentage of stuttered syllables (frequency score); (ii) the average stutter duration of the three longest stutters during reading (duration score); and (iii) physical concomitant assessment (e.g. distracting sounds, facial grimaces, etc.). The total overall score was obtained by adding together the scores of the three components. Then a percentile was calculated and the results were categorized into one of five categories: very mild, mild, moderate, severe, or very severe [18]. The Stanford Binet test [19] (psychometric testing) and the Comprehensive Arabic language test [20] were also used as formal tests.

Additional instrumental measures
These included acoustic analysis of vowel /a/ at a comfortable level. This was done using the vocal assessment program from Tiger DRS4 (Tiger DRS, Inc, Seattle, Washington, USA).

Audiological examination
ABRs can be evoked using click (cABR), as well as more complex signals such as speech sounds, which are relatively more informative with respect to psychological and linguistic aspects [8]. cABR, sABR, and MMN were examined in all participants using an auditory-evoked potential audiometer (model Smart EP, version 2.39, Intelligent Hearing Systems, Miami, Florida, USA). The test stimuli were presented monaurally to both ears through TDH-39 headphones. All recordings were made using silver-silver chloride scalp electrodes. The electrode sites were cleaned with alcohol and scrubbed with abrasive paste to keep the electrode impedance below 3 kΩ. The electrodes were placed at the frontal midline (Fz) (active), at the ipsilateral mastoid (reference), and at the contralateral mastoid (ground). All participants were tested while sitting comfortably on a bed in a sound-isolated room. They were instructed to ignore the test stimuli and their attention was distracted by a very low-volume (<40 dB SPL) film displayed on a computer.

Stimuli presentation and recording
Auditory brainstem response
Rarefaction acoustic click with a duration of 100 μs was used to evoke the click-evoked auditory brainstem response (cABR). It was presented at an intensity of 90 dB nHL and at a rate of 19.3/s. A total of 1024 sweeps were obtained from the stimulated ear. Recordings were made with a band-pass filter of 100-1500 Hz in a time window of 10 ms.
A pilot study was conducted on five normally fluent participants and five stutterers using different speech stimuli (ba, da, and ga) to elicit the sABR. The /ba/ stimulus was chosen in this study because it provided an sABR with better morphology and higher amplitude than the /da/ and /ga/ stimuli. The sABR to /ba/ stimulus in the pilot was added to the study data. The stimulus duration was 114.875 ms, which is the default of the Smart EP. The /ba/ stimulus is characterized by having voicing onset at 10 ms and F0 (100 Hz). The formant transition duration is 50 ms and includes linearly rising and flat portions. The linearly rising portion comprises F1 (400-720 Hz), F2 (900-1240 Hz), and F3 (2400-2500 Hz), whereas the flat one includes F4 (3300 Hz), F5 (3750 Hz), and F6 (4900 Hz). Ten milliseconds of initial frication are centered at frequencies of around F4 and F5 [21].
The /ba/ stimulus was presented in alternating polarity at a rate of 8.42/s with an interstimulus interval of 3.83 ms and intensity of 70 dB nHL. One thousand sweeps were collected from the right and left ears separately using a filter of 30-3000 Hz and digitized at 20 kHz. An artifact criterion of ± 35 μV was applied to reject epochs that contained myogenic artifacts. Data were plotted in a time window of 10 ms before stimulus onset to 70 ms after stimulus onset.
The absolute latency of the positive waves I, III, and V and the interpeak latencies I-III, III-V, and I-V were identified and measured to analyze the cABR. The sABR is formed of transient, transitional, and sustained portions. The onset response of the transient portion was analyzed for wave V and A latencies and V-A complex measures [interpeak amplitude, duration, and slope (interpeak amplitude/duration)], whereas the offset response was analyzed for wave O latency and amplitude. The sustained portion represents the frequency following response. It was identified as negative troughs (D, E, and F) occurring every 10 ms and measured for their latency and amplitude. Wave C is a transitional negative wave between the two portions of the sABR. Its amplitude and latency were also measured [22].

Mismatch negativity response
Speech stimuli were /wa/ as the standard and /ba/ as a deviant and were presented in an odd-ball paradigm with a probability of 80% for the standard and 20% for the deviant stimuli. The two syllables differed acoustically in terms of the duration of the initial formant transitions [23]. A total of 250 stimuli were presented in alternating polarity, at an intensity of 80 dB nHL and a rate of 1.1/s. Fifty sweeps from each ear were averaged separately. The analysis period was 500 ms with 50 ms prestimulus recording. Recordings were made with an amplification of 50, artifact rejection level of 100 μV, and band-pass filter of 0.8-30 Hz.
The MMN wave was identified visually as a relative negativity in the difference waveform with a latency range of 100-300 ms following stimulus onset [24]. It was computed by subtracting the standard wave from the deviant one. Analysis of the MMN wave included peak latency, peak amplitude, duration (offset latency − onset latency), and area (amplitude × duration).

Statistical analyses
Data from the right and left ears of all participants were collected and tabulated in raw data tables. They were statistically analyzed using the SPSS software statistical computer package version 20 (SPSS Inc., Chicago, Illinois, USA). Simple descriptive analysis was performed to calculate the mean ± SD of the test variables. The data were found to be homogeneous on the variance homogeneity test. Hence, parametric tests were applied. The mean values of the control versus study groups' acoustic analysis results of vowel /a/ and the ABR and MMN measures in the right and left ears separately were compared using the independent sample t-test to calculate the t-value and its P value. Pearson's correlation coefficient was used to test the presence of correlation between severity of stuttering and auditory processing abnormalities. For all tests, statistical significance was set at a P value less than 0.05.
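The two parametric tests named in the statistical analyses can be illustrated with a minimal sketch. It assumes the pooled-variance (Student) form of the independent sample t-test, which is consistent with the reported variance homogeneity; the function names and the latency and severity values are hypothetical and are not taken from the study data. The sketch returns the t-value with its degrees of freedom and Pearson's r only; obtaining the corresponding P value would additionally require the t distribution from a statistics package such as the one used by the authors.

```python
import math
from statistics import mean, variance


def independent_t(a, b):
    """Student's independent-samples t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in both groups.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient for paired data."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical wave V latencies (ms) for illustration only.
control_latency = [6.5, 6.6, 6.7, 6.6, 6.5]
stutter_latency = [7.0, 7.2, 7.1, 7.3, 6.9]
t_value, dof = independent_t(control_latency, stutter_latency)

# Hypothetical severity scores paired with the stutterers' latencies.
severity_score = [24, 31, 28, 35, 22]
r_value = pearson_r(severity_score, stutter_latency)
```

A negative t-value here simply reflects shorter mean latency in the first (control) sample; the sign depends on the order in which the groups are passed.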

Results of phoniatric assessment
Auditory perceptual assessment of the 11 patients revealed that all of them (100%) had IPDs, eight (72.7%) experienced repetition of syllables and words, three (27.3%) had prolongations, and three (27.3%) had blocks. Visual perceptual assessment revealed that all patients (100%) had good to fair eye contact and seven (63.64%) had involuntary movement in either the face, neck, or extremities.
Assessment of SSI-3 revealed that none of the patients had very mild stuttering, one (9.09%) had mild stuttering, two (18.18%) had moderate stuttering, five (45.45%) had severe stuttering, and three (27.27%) had very severe stuttering. Results of the acoustic analysis in the control and study groups are shown in Table 1. Comparison of acoustic analysis between the two groups revealed significant differences in jitter percentage (Jitt%), shimmer percentage (Shim%), and harmonic to noise ratio, whereas there was no significant difference in average fundamental frequency (F0).

Results of audiological examination
Auditory brainstem response
Statistical analysis performed on the absolute and interpeak latency values of cABR demonstrated no statistically significant difference between the control and study groups in any of these measures (Table 2).
The sABR waves were identifiable at all times (100%) in both groups. As shown in Table 3, the independent sample t-test revealed statistically significant delay in the peak latency of the transient portion's waves (V, A, and O). However, the mean value of the V-A complex measures and the peak amplitude of wave O of the normal and dysfluent participants showed no significant difference. Similarly, the transitional and sustained portions presented with statistically significant longer peak latency of all waves (C, D, E, and F) in the stuttering group with no significant difference between the peak amplitude values of the two groups (Table 4). Figure 1 shows the delayed peak latency of all sABR waves in the study group. A statistically significant delay in the peak latency of sABR waves was recorded in eight (72.7%) stutterers, whereas the remaining three (27.3%) exhibited normal peak latency.

Mismatch negativity response
Figure 1
Representative example comparing the right (a) and left (b) speech-evoked auditory brainstem response (sABR) of a normally fluent individual (upper two traces) with those of a stuttering individual (lower two traces). It shows the delayed peak latencies of the stutterer's sABR waves.

Identifiable MMNs could be elicited in both groups' right and left ears for the speech odd-ball paradigm except from one right ear in the study group. The independent sample t-test was applied to compare the MMN parameters in both groups (Table 5). The peak latency of MMN was statistically significantly longer in stutterers than in normally fluent participants in the right and left ears. In contrast, the peak amplitude, duration, and area of MMN did not differ significantly between groups (Fig. 2).
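The MMN measures compared here (peak latency, peak amplitude, duration, and area) derive from the difference waveform, obtained by subtracting the standard response from the deviant response and searching the 100-300 ms latency range. The following is a minimal sketch of that computation, not the procedure of the recording software. In the study the MMN was identified visually, so the automatic rule used here is an assumption: the peak is taken as the most negative sample in the window, and onset and offset are taken as the edges of the contiguous negative run around that peak. The function name and the sample values are hypothetical.

```python
def mmn_measures(times_ms, standard_uv, deviant_uv, window=(100.0, 300.0)):
    """Return MMN peak latency, amplitude, duration, and area, or None."""
    # Difference waveform: deviant response minus standard response.
    diff = [d - s for d, s in zip(deviant_uv, standard_uv)]

    # Candidate samples inside the latency range used to search for the MMN.
    candidates = [i for i, t in enumerate(times_ms) if window[0] <= t <= window[1]]
    if not candidates:
        return None

    # Peak = most negative sample of the difference waveform in the window.
    peak = min(candidates, key=lambda i: diff[i])
    if diff[peak] >= 0:
        return None  # no relative negativity found in the window

    # ASSUMPTION: onset/offset are the first and last samples of the
    # contiguous negative deflection that contains the peak.
    onset = peak
    while onset > 0 and diff[onset - 1] < 0:
        onset -= 1
    offset = peak
    while offset < len(diff) - 1 and diff[offset + 1] < 0:
        offset += 1

    duration = times_ms[offset] - times_ms[onset]  # offset latency - onset latency
    amplitude = abs(diff[peak])
    return {
        "peak_latency_ms": times_ms[peak],
        "peak_amplitude_uv": amplitude,
        "duration_ms": duration,
        "area_uv_ms": amplitude * duration,  # area = amplitude x duration
    }


# Hypothetical, coarsely sampled averaged responses (microvolts).
times = [0.0, 50.0, 100.0, 150.0, 200.0, 250.0, 300.0, 350.0, 400.0]
standard = [0.0] * 9
deviant = [0.0, 0.0, -1.0, -2.0, -3.0, -1.0, 0.0, 0.0, 0.0]
result = mmn_measures(times, standard, deviant)
```

With real recordings the sampling would be much denser, and a visually guided choice of onset and offset could differ from this zero-crossing rule.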

Distribution of brainstem timing and cortical processing deficits
An illustration of the auditory-evoked potential findings among individuals with PDS at the brainstem and cortical levels is presented in Figure 3. Abnormal MMN was revealed in nine (81.8%) stutterers: seven (63.64%) of them showed an additional sABR abnormality, whereas the other two (18.18%) had abnormality in the MMN alone. Only two (18.18%) patients had normal MMN.

MMN, mismatch negativity; sABR, speech-evoked auditory brainstem response; SSI-3, stuttering severity instrument-3. *Significant difference when P = 0.05 to >0.01, **highly significant difference when P = 0.01 to >0.001, and ***very highly significant difference when P ≤ 0.001.

Figure 3
Distribution of brainstem timing and cortical processing deficits among individuals with PDS. The positive symbol represents abnormal peak latencies, whereas the negative symbol represents normal peak latencies. MMN, mismatch negativity; sABR, speech-evoked auditory brainstem response.

Relationship between the audiological findings and the severity of stuttering
A positive correlation (P < 0.05) was found between impaired brainstem timing and cortical processing and the severity of stuttering (Table 6).

Validity of the audiological findings
Finally, the validity of the sABR and MMN parameters that differed significantly between the two research groups was evaluated by calculating the sensitivity, specificity, and accuracy as follows: (1) Sensitivity = (patients with the disease detected by the test/total number of patients with the disease) × 100. (2) Specificity = (individuals without the disease who are negative by the test/total number of individuals without the disease) × 100. (3) Accuracy = [(true positives + true negatives)/total number of participants] × 100. Table 7 shows the validity for the average value of the right and left measures. The highest validity was obtained for the peak latency of the MMN, followed by the transient portion of the sABR and then the transitional and sustained portions.
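As a worked illustration of these three formulas, the sketch below computes them from the four cells of a two-by-two classification table. The function name and the counts are hypothetical and are not the values underlying Table 7.

```python
def validity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity, accuracy) as percentages."""
    # Sensitivity: patients detected by the test / all patients with the disorder.
    sensitivity = 100 * true_pos / (true_pos + false_neg)
    # Specificity: test-negative controls / all individuals without the disorder.
    specificity = 100 * true_neg / (true_neg + false_pos)
    # Accuracy: all correct classifications / all participants.
    total = true_pos + false_neg + true_neg + false_pos
    accuracy = 100 * (true_pos + true_neg) / total
    return sensitivity, specificity, accuracy


# Hypothetical counts: 10 patients (8 test-positive) and 10 controls (9 test-negative).
sens, spec, acc = validity(true_pos=8, false_neg=2, true_neg=9, false_pos=1)
```

For these hypothetical counts the sensitivity is 80%, the specificity 90%, and the accuracy 85%.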

Discussion
The aim of this study was to evaluate whether neural encoding of speech features at brainstem and cortical levels is altered in adults with PDS. The diagnosis of PDS was confirmed on the basis of the multidimensional assessment protocol of stuttering, which revealed the presence of common features that are usually associated with stuttering: IPDs, repetitions of syllables and words, prolongations, and blocks. Visual perceptual assessment revealed the presence of involuntary movements in different parts of the body.
Results of SSI-3 confirmed the presence of stuttering in all patients but at different levels of severity.
Comparison between the control and study groups in the acoustic analysis revealed significant differences in Jitt%, Shim%, and harmonic to noise ratio. This indicates that the differences between the speech of stutterers and those of normal individuals appear not only at the level of syllables and words but also at the level of sounds. This becomes clear during acoustic analysis of the vowel /a/.
One of the main findings in this study is that all participants exhibited normal brainstem processing of nonspeech (click) stimuli. However, there were differences between the control and PDS groups with respect to processing of speech. The PDS individuals demonstrated reduced neural synchrony and phase locking to speech at the level of the brainstem that was reflected as significant prolongation of peak latency of all waves of sABR. This finding is consistent with that observed by Stager [25] and Angrisani et al. [26].
In these studies, stutterers as a group did not differ from normally fluent participants with respect to the outcome of cABR. In contrast, Khedr et al. [27] found significant prolongation of absolute and interpeak latencies of cABR in their stuttering participants. Using more complex but nonspeech sounds synthesized from sine-wave stimuli to evoke frequency following response, Hampton et al. [28] reported degraded frequency change (amplitude reduction as well as poor tracking of the frequency change) in some but not all adults with PDS, suggesting a brainstem temporal processing disruption.
The varied brainstem encoding of click versus speech stimuli probably reflects the different acoustic and environmental quality of the two stimuli. Clicks are stimuli with rapid onset, brief duration, and flat broadband spectral components, whereas speech consonant-vowel syllables have rapid and low amplitude onset, longer duration, and complex time-varying spectral content. Backward masking of the consonant by the following more intense vowel is a feature of consonant-vowel syllables.
In addition, the response to click is limited to the onset, whereas the response to speech syllables includes three portions: transient, sustained, and transitional. The transient portion of the brainstem response reflects the encoding of rapid temporal changes inherent in the consonant. The sustained response encodes the harmonic and periodic sound structure of vowels activating different response mechanisms from onset responses, whereas the transitional portion represents the change from the transient to the periodic portion of the syllable [8,10]. Therefore, click and speech stimuli entail different brainstem processing demands.
Moreover, the speech-click ABR discrepancy reflects different developmental trajectories of the two stimuli, as the transcription of speech is known for its experience dependence and is affected by auditory learning [21]. Thus, speech could provoke experience-related shaping of the brainstem neural response, unlike clicks, which are laboratory-based and synthetic. Consequently, investigating the brainstem encoding of speech sounds is necessary as responses to these stimuli can uncover auditory processing deficits [29] and experience-dependent impact on the brainstem encoding [30] that clicks alone cannot.
Another distinctive feature characterizing this study is the delayed peak latency of MMN response to speech syllable contrasts in stutterers. A different finding was reported in a study by Corbera et al. [31]. They found significant enhancement of left mastoid MMN amplitude in response to phonetic (vowel) contrasts in adults with PDS. This was attributed to the effect of auditory feedback in alleviating the stuttering behavior, in which external clues help synchronize neural activity in auditory areas related to the speech sound in play. Consequently, the altered MMN instead of being absent or reduced (as expected in clinical populations) was abnormally enlarged, suggesting an overexcited response of the auditory cortex to specific speech sounds.
The combination of abnormalities at rostral brainstem and cortical (auditory and inferior frontal cortices) levels in this study could be attributed to disruption of the functional relationships between brainstem and cortical processing. Three hypotheses could be postulated to explain this relationship: corticofugal modulations, bottom-up deficit, and absence of functional relationship [9,32].
In the corticofugal modulation, the cortical influence relates to the fine tuning of the neural processing at the level of the thalamus [33] and inferior colliculus [34] by enhancing the relevant signals and suppressing the undesired ones [35]. Auditory sensory memory and neural plasticity, which can be assessed by the MMN, are among those cortical functions. Consequently, cortical dysfunction will disrupt this neural feedback resulting in defective auditory brainstem processing. This interpretation could be supported by the fact that stutterers with abnormal brainstem timing to speech stimuli actually have normal ABR to click stimuli. Therefore, the response generators at the brainstem level may be intact. Hence, abnormal timing could be the output of abnormal cortical feedback [32].
In contrast, the bottom-up hypothesis postulates that precise brainstem timing is essential for the cortical ability to adequately process auditory stimuli. Therefore, the impaired neural timing could adversely affect the cortical processing. The abnormal rostral brainstem timing may arise from abnormal function of the neural generators of sABR, presumably the cochlear nucleus, the lateral lemniscus, and the inferior colliculus [36]. Confirmation of this suggestion may be based on the more complex nature of speech stimuli that adds more stress on the functionally impaired neural generators. Thus, the sABR is impaired in the presence of normal cABR in stutterers.
A third possibility regarding the linkage between brainstem timing and cortical activity is the absence of a functional relationship. This could be related to the variability noticed in the current study in individuals with PDS in whom the pathology was restricted to either the brainstem in one participant (9.09%) or the cortex in two participants (18.18%).
The altered auditory processing and hence the feedback that modulates the articulation process may have a role in stuttering. The support for this notion comes from the positive correlation between the delay in peak latencies of sABR and MMN waves and severity of stuttering. This reflects the relationship between brainstem and cortical auditory processing abnormalities and stuttering. Moreover, Van Riper [37] noticed a cessation of stuttering in a patient a few hours after incidental traumatic hearing loss, reflecting a relationship between auditory feedback and stuttering.
To study the validity of auditory-evoked potentials (ABR and MMN) in detecting auditory sensory processing abnormalities in patients with PDS, we estimated the accuracy of these measures (Table 7). The peak latency of MMN was found to have the highest accuracy value (83.3%), followed by the peak latency of the waves in the transient portion of the sABR (71.17%), then the transitional portion (63.6%), and finally the sustained portion (59.05%).
From these data we postulate that the ABR and MMN successfully estimate the brainstem and cortical auditory processing in patients with PDS. Abnormal brainstem timing and/or cortical dysfunction could present in a subset of patients. In such a case, brainstem and cortical auditory processing abnormalities are proposed to be the underlying deficit in this group. These results suggest the need for future research that aims to study the effect of DAF and auditory training on the auditory processing capabilities of the stuttering population and hence on their speech fluency and whether the ABR and MMN could be used as tools to monitor the efficiency of auditory training programs.