Elsevier

Hearing Research

Volume 271, Issues 1–2, January 2011, Pages 103-114

Research paper
Enhanced physiologic discriminability of stop consonants with prolonged formant transitions in awake monkeys based on the tonotopic organization of primary auditory cortex

https://doi.org/10.1016/j.heares.2010.04.008

Abstract

Many children with specific language impairment (SLI) have difficulty in perceiving stop consonant-vowel syllables (e.g., /ba/, /ga/, /da/) with rapid formant transitions, but perform normally when formant transitions are extended in time. This influential observation has helped lead to the development of the auditory temporal processing hypothesis, which posits that SLI is causally related to the processing of rapidly changing sounds in aberrantly expanded windows of temporal integration. We tested a potential physiological basis for this observation by examining whether syllables varying in their consonant place of articulation (POA) with prolonged formant transitions would evoke better differentiated patterns of activation along the tonotopic axis of A1 in awake monkeys when compared to syllables with short formant transitions, especially for more prolonged windows of temporal integration. Amplitudes of multi-unit activity evoked by /ba/, /ga/, and /da/ were ranked according to predictions based on responses to tones centered at the spectral maxima of frication at syllable onset. Population responses representing consonant POA were predicted by the tone responses. Predictions were stronger for syllables with prolonged formant transitions, especially for longer windows of temporal integration. The relevance of these findings to normal perception and to perception in SLI is discussed.

Introduction

In one of the most influential and controversial observations in phonology, Tallal et al. reported that many children with specific language impairment (SLI) had difficulty in the perception of stop consonant-vowel (CV) syllables (e.g., /ba/, /ga/, /da/) with rapid formant transitions (e.g., 40 ms), but performed similarly to controls when formant transitions were extended in time (e.g., 80 ms) (for reviews see Tallal et al., 1993, Tallal, 2004). This fundamental observation has been replicated (e.g., Elliott et al., 1989, Kraus et al., 1996, Stark and Heinz, 1996; for review see Burlingame et al., 2005) and expanded to include both subjects with dyslexia and subjects with deficits in other, non-linguistic, auditory perceptions that involve processing of rapidly changing sound stimuli (e.g., Hari and Kiesilä, 1996, Wright et al., 1997, Helenius et al., 1999, Laasonen et al., 2000, King et al., 2003). Ultimately, this evolving and frequently contradictory body of evidence led to the formulation of the auditory temporal processing hypothesis, which posits that difficulty in processing rapidly changing sound stimuli is causally related to SLI (Tallal, 2004). In its latest form, the hypothesis holds that individuals with SLI exhibit deficiencies in phonological functions because they aberrantly process rapidly changing sound stimuli in expanded temporal windows of integration, “chunking” acoustic events together in time periods too large to allow the fine-grained patterns of speech to be readily discriminated. This hypothesis has become so influential that it has spawned commercially available and clinically successful remediation programs (Merzenich et al., 1996, Tallal et al., 1996, Tallal et al., 1998, Tallal, 2004; see also Kujala et al., 2001, Stevens et al., 2008).

While the temporal processing hypothesis remains highly contentious (e.g., Studdert-Kennedy and Mody, 1995, Bishop et al., 1999, Goswami, 2003), temporal or other auditory processing deficits are likely causal in a subset of people with SLI (Heath et al., 1999, Habib, 2000, Ramus et al., 2003). Strong evidence that temporal processing deficits are directly relevant to SLI was provided by Benasich and Tallal (2002), who found that performance of 6- to 9-month-old infants on a rapid auditory discrimination task was predictive of later language impairments at 2 and 3 years of age. While not a demonstration of causality, this result does suggest that auditory processing plays an important role in the maturation of speech functions. Furthermore, this finding is consistent with modern views of developmental disorders, according to which fundamental deficits in neural processing (e.g., auditory temporal processing), often produced by a genetic anomaly, may only be reliably identified at early developmental stages (Karmiloff-Smith, 1998, see also Goswami, 2003). At later developmental stages, these early deficits may be partially ameliorated by compensatory mechanisms and yet result in more persistent, higher-order deficiencies (e.g., SLI).

What remains unresolved in the consideration of these larger issues is the question of why some children with SLI perceive stop CVs with longer formant transitions more accurately than those with short formant transitions. Simply stating that the cause is a deficit in rapid temporal processing does not help to identify potential neural substrates of SLI. This issue is extremely important, as illuminating neural mechanisms potentially relevant for the improvement in phonetic discrimination might provide valuable insights into the competing hypotheses regarding the etiology of SLI and encourage modifications of remediation strategies. Thus, in order to more fully examine this question at a physiological level, data and hypotheses that directly address the differential perception of stop CV syllables must be incorporated into any explanatory framework for SLI.

Almost as contentious as the temporal processing hypothesis of SLI are the competing ideas regarding the perception of stop consonants. At one extreme, their perception is thought to be accomplished through a specialized, speech-specific module that reacquires the intended phonetic gestures of the speaker (e.g., Liberman et al., 1967, Liberman and Mattingly, 1985). Thus, speech perception is intimately tied to speech production. At the other extreme, general principles of auditory processing are thought to be sufficient to map the acoustic signal onto discrete phonetic categories (e.g., Lotto and Kluender, 1998, Stevens, 1980, Kuhl, 2004). This hypothesis is further supported by evidence that humans and non-human animals share many similarities in speech perception (e.g., Kuhl, 1986, Lotto et al., 1997, Ramus et al., 2000, Sinnott and Gilmore, 2004). These similarities justify the use of non-human animal models to examine some of the fundamental questions concerning physiological mechanisms of speech perception.

One of the most prominent acoustically-based hypotheses put forth to explain perceptual differences in stop consonants argues that the short-term acoustic spectrum within the first 20 ms of consonant onset (release) is a major determinant of their differential perception (Stevens and Blumstein, 1978, Blumstein and Stevens, 1979, Blumstein and Stevens, 1980, Chang and Blumstein, 1981). According to this hypothesis, onset spectra that are diffuse and display maxima at higher frequencies (diffuse rising) promote the perception of /d/, those that are diffuse and display maxima at lower frequencies (diffuse falling) promote the perception of /b/, while more compact spectra with maxima at mid-frequencies promote the perception of /g/. While modifications of this basic scheme have been made that incorporate the onset spectra at consonant onset within the context of later acoustic spectra occurring during the following vowel (e.g., Lahiri et al., 1984), the preceding explanation works well when onset spectra are not ambiguous (Alexander and Kluender, 2008). Furthermore, onset spectra are sufficient for stop consonant discrimination in neonates, children, and adults when stimuli are exceedingly short and lack prolonged vowels (<50 ms), or when formant transitions are omitted from the synthesized syllables (Bertoncini et al., 1987, Ohde et al., 1995, Ohde and Halay, 1997, Ohde and Abou-Khalil, 2001). Overall, these results support the importance of onset spectra in the perception of stop consonants varying in their place of articulation (POA).
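
As a rough illustration of the onset-spectrum scheme described above, the following Python sketch maps a short-term spectrum to /b/, /d/, or /g/. The tilt measure (a least-squares slope), the compactness measure (peak level above the median level), the 10 dB threshold, and the 1 to 3 kHz mid-frequency band are all assumptions made for illustration; none of them is taken from Stevens and Blumstein (1978) or from the present study.

```python
from statistics import mean, median


def spectral_slope(freqs_khz, levels_db):
    """Least-squares slope (dB per kHz) of level against frequency."""
    f_bar, l_bar = mean(freqs_khz), mean(levels_db)
    num = sum((f - f_bar) * (l - l_bar) for f, l in zip(freqs_khz, levels_db))
    den = sum((f - f_bar) ** 2 for f in freqs_khz)
    return num / den


def classify_onset_spectrum(freqs_khz, levels_db,
                            compact_db=10.0, mid_band_khz=(1.0, 3.0)):
    """Label an onset spectrum as 'b', 'd', or 'g'.

    compact_db and mid_band_khz are hypothetical parameters chosen
    only to make the three spectral shapes separable in this sketch.
    """
    peak_level = max(levels_db)
    peak_freq = freqs_khz[levels_db.index(peak_level)]
    prominence = peak_level - median(levels_db)
    # Compact spectrum: one prominent mid-frequency peak -> /g/
    if prominence >= compact_db and mid_band_khz[0] <= peak_freq <= mid_band_khz[1]:
        return "g"
    # Diffuse spectrum: rising tilt -> /d/, falling tilt -> /b/
    return "d" if spectral_slope(freqs_khz, levels_db) > 0 else "b"


# Illustrative spectra (invented values, not measurements)
freqs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
rising = [40, 42, 44, 46, 48, 50, 52, 54]
falling = list(reversed(rising))
compact = [40, 41, 42, 60, 42, 41, 40, 40]
```

The compactness test is applied first because a single prominent peak can also produce a nonzero overall tilt; a diffuse spectrum falls through to the tilt test.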

An attractive feature of this perceptual hypothesis is that it provides a physiologically plausible means by which stop consonants can be differentially represented at early stages of auditory cortical processing. Multiple studies have shown that complex vocalizations are differentially represented in primary auditory cortex (A1) of experimental animals based on the underlying tonotopic organization and the spectral content of the sounds (e.g., Creutzfeldt et al., 1980, Wang et al., 1995, Steinschneider et al., 2003, Mesgarani et al., 2008). This relationship between stop consonant spectra and tonotopic patterns of activity in A1 implies that activation by /b/ should be maximal in low best frequency (BF) regions of A1, while /g/ and /d/ should maximally excite mid- and high-BF regions of A1, respectively. Physiological results in A1 support this prediction when examining onset responses evoked by stop CV syllables (Steinschneider et al., 1995, Engineer et al., 2008).
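
The tonotopic prediction can be expressed as a rank-order comparison at each recording site: the consonant whose onset spectral maximum lies closest to the site's preferred frequencies should evoke the largest response. The Python sketch below assumes that, for each site, one tone response and one syllable response per consonant are already available as single amplitudes. The function names and the numerical values are hypothetical and do not reproduce the analysis or the data of the study.

```python
def predicted_order(responses):
    """Consonant labels sorted from largest to smallest response amplitude."""
    return sorted(responses, key=responses.get, reverse=True)


def full_order_matches(tone_responses, syllable_responses):
    """True if the syllable responses follow the complete tone-based ranking."""
    return predicted_order(tone_responses) == predicted_order(syllable_responses)


def fraction_top_match(sites):
    """Fraction of sites whose largest syllable response is the one the tones predict.

    sites is a list of (tone_responses, syllable_responses) pairs.
    """
    hits = sum(predicted_order(t)[0] == predicted_order(s)[0] for t, s in sites)
    return hits / len(sites)


# Invented amplitudes for two hypothetical sites (arbitrary units)
low_bf_site = ({"b": 5.0, "g": 2.0, "d": 1.0}, {"b": 4.0, "g": 3.0, "d": 1.5})
high_bf_site = ({"b": 1.0, "g": 2.0, "d": 6.0}, {"b": 2.0, "g": 1.0, "d": 5.0})
```

Comparing only the top-ranked consonant is a looser criterion than comparing the full ordering; the second hypothetical site satisfies the former but not the latter.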

Integrating the perceptual hypothesis of Stevens and Blumstein (1978) with both the physiological characteristics of A1 noted above and the temporal processing hypothesis of SLI leads to the two predictions that are the focus of this report. First, syllables with prolonged formant transitions should evoke better-differentiated patterns of activation along the A1 tonotopic axis when compared to syllables with short formant transitions. This prediction is based on the premise that one consequence of lengthening formant transitions is to prolong the duration of spectral differences between the consonants prior to their convergence to common steady-state values. Second, the enhanced response differentiation for prolonged formant transition syllables relative to syllables with rapid formant transitions should be most evident when neural activity is analyzed within longer windows of temporal integration. This work expands our previous examination of POA representation in monkey A1, which only examined neural responses within the first 35 ms after stimulus onset (short temporal integration window) and only to syllables with 40 ms formant transitions (Steinschneider et al., 1995).
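
The logic of the two predictions can be made concrete with a toy calculation. The Python sketch below assumes idealized response traces at a single hypothetical low-BF site, in which the consonants evoke different amplitudes only while the formant transitions last and identical amplitudes afterwards. The trace shapes, the amplitudes, the 5 ms sampling step, and the max-minus-min differentiation index are all assumptions for illustration, not the measures used in the study.

```python
def window_mean(trace, dt_ms, window_ms):
    """Mean amplitude over the first window_ms of a trace sampled every dt_ms."""
    n = max(1, min(len(trace), round(window_ms / dt_ms)))
    return sum(trace[:n]) / n


def differentiation(traces, dt_ms, window_ms):
    """Spread (max minus min) of the windowed means across syllables."""
    means = [window_mean(t, dt_ms, window_ms) for t in traces.values()]
    return max(means) - min(means)


def toy_traces(transition_ms, dt_ms=5.0, total_ms=100.0):
    """Idealized traces: syllables differ only during the formant transition."""
    n_total = round(total_ms / dt_ms)
    n_trans = round(transition_ms / dt_ms)

    def trace(onset_level):
        # After the transition all syllables converge to a common level of 1.0
        return [onset_level if i < n_trans else 1.0 for i in range(n_total)]

    return {"ba": trace(2.0), "ga": trace(1.5), "da": trace(1.0)}


short = toy_traces(transition_ms=40.0)
long_ = toy_traces(transition_ms=80.0)
short_35, long_35 = differentiation(short, 5.0, 35.0), differentiation(long_, 5.0, 35.0)
short_80, long_80 = differentiation(short, 5.0, 80.0), differentiation(long_, 5.0, 80.0)
```

In this toy model the 35 ms window yields the same differentiation for both transition durations, whereas the 80 ms window dilutes the differences for the 40 ms transition but preserves them for the 80 ms transition, which is the pattern the second prediction anticipates.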

Section snippets

Materials and methods

Four male macaque monkeys (Macaca fascicularis), weighing between 2.6 and 3.9 kg at the time of surgery, were studied following approval by the Animal Care and Use Committee of Albert Einstein College of Medicine. Experiments were conducted in accordance with institutional and federal guidelines governing the use of primates, who were housed in our AAALAC-accredited Animal Institute and were under the continuous monitoring and care of the Animal Institute veterinarians. Other protocols were

Results

Results are based on middle laminae MUA obtained from 47 A1 recording sites whose BFs were within the speech range (<4 kHz). Technical issues limited the responses evoked by the 60 ms and 40 ms formant transition syllables to 46 and 45 recording sites, respectively. Thirteen sites had BFs < 1.0 kHz, 16 had BFs between 1.0 and 1.9 kHz, 9 had BFs between 2.0 and 2.9 kHz, and 9 had BFs between 3.0 and 4.0 kHz.
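
As a quick consistency check on the site counts above, the four BF ranges sum to the 47 reported sites. The Python sketch below encodes the reported counts and a binning helper; treating the ranges as half-open intervals (for example, 1.0 kHz up to but not including 2.0 kHz) is an assumption, since the text lists them to one decimal place.

```python
# Site counts per BF range, as reported in the text
BIN_COUNTS = {"<1.0": 13, "1.0-1.9": 16, "2.0-2.9": 9, "3.0-4.0": 9}


def bf_bin(bf_khz):
    """Assign a best frequency (kHz) to one of the four reported ranges.

    Half-open bin edges are an assumption made for this sketch.
    """
    if bf_khz < 1.0:
        return "<1.0"
    if bf_khz < 2.0:
        return "1.0-1.9"
    if bf_khz < 3.0:
        return "2.0-2.9"
    if bf_khz <= 4.0:
        return "3.0-4.0"
    raise ValueError("BF above the 4 kHz limit used for site selection")


total_sites = sum(BIN_COUNTS.values())  # 13 + 16 + 9 + 9
```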

To ensure that there was no systematic modulation of the cortical activity based solely

Summary of data

This report identifies two key features of the neural representation of stop consonants in monkey A1: 1) Spectral characteristics of the stop CV syllables /ba/, /ga/, and /da/ occurring at stimulus onset are sufficient to produce differential patterns of neural activity in A1 that are based on the underlying tonotopic organization. 2) A1 displays enhanced discrimination of stop CV syllables with prolonged formant transitions relative to those with short formant transitions when more prolonged

Acknowledgements

The authors thank Drs. Charles E. Schroeder and David H. Reser, and Ms. Jeannie Hutagalung for their invaluable technical assistance. Supported by National Institute on Deafness and Other Communication Disorders Grant DC-00657.

References (89)

  • M.M. Merzenich et al. Representation of the cochlear partition on the superior temporal plane of the Macaque monkey. Brain Res. (1973)
  • T. Moisescu-Yiflach et al. Auditory event related potentials and source current density estimation in phonologic/auditory dyslexics. Clin. Neurophysiol. (2005)
  • P. Müller-Preuss et al. Functional anatomy of the inferior colliculus and the auditory cortex: current source density analyses of click-evoked potentials. Hear. Res. (1984)
  • I. Nelken et al. Population responses to multifrequency sounds in the cat auditory cortex: one- and two-parameter families of sounds. Hear. Res. (1994)
  • C.I. Petkov et al. Auditory perceptual grouping and attention in dyslexia. Cogn. Brain Res. (2005)
  • M. Steinschneider et al. Tonotopic organization of responses reflecting stop consonant place of articulation in primary auditory cortex (A1) of the monkey. Brain Res. (1995)
  • M. Steinschneider et al. Cellular generators of the cortical auditory evoked potential initial component. Electroenceph. Clin. Neurophysiol. (1992)
  • C. Stevens et al. Neural mechanisms of selective auditory attention are enhanced by computerized training: electrophysiological evidence from language-impaired and typically developing children. Brain Res. (2008)
  • C. Stevens et al. Neurophysiological evidence for selective auditory attention deficits in children with specific language impairment. Brain Res. (2006)
  • H. Supèr et al. Chronic multiunit recordings in behaving animals: advantages and limitations
  • A. Adlard et al. Speech perception in children with specific reading difficulties (dyslexia). Quart. J. Exp. Psychol. (1998)
  • J.M. Alexander et al. Spectral tilt change in stop consonant perception. J. Acoust. Soc. Am. (2008)
  • J.M. Alexander et al. Spectral tilt change in stop consonant perception by listeners with hearing impairment. J. Speech Lang. Hear. Res. (2009)
  • K. Banai et al. Poor frequency discrimination probes dyslexics with particularly impaired working memory. Audiol. Neurootol. (2004)
  • J. Bertoncini et al. Discrimination in neonates of very short CVs. J. Acoust. Soc. Am. (1987)
  • J.R. Binder et al. Human temporal lobe activation by speech and non-speech sounds. Cerebral Cortex (2000)
  • D.V.M. Bishop et al. Auditory temporal processing impairment: neither necessary nor sufficient for causing language impairment in children. J. Speech Lang. Hear. Res. (1999)
  • S.E. Blumstein et al. Acoustic invariance in speech production: evidence from measurements of the spectral characteristics of stop consonants. J. Acoust. Soc. Am. (1979)
  • S.E. Blumstein et al. Perceptual invariance and onset spectra for stop consonant vowel environments. J. Acoust. Soc. Am. (1980)
  • A. Boemio et al. Hierarchical and asymmetrical temporal sensitivity in human auditory cortices. Nature Neurosci. (2005)
  • E. Burlingame et al. An investigation of speech perception in children with specific language impairment on a continuum of formant transition duration. J. Speech Lang. Hear. Res. (2005)
  • S. Chang et al. The role of onsets in perception of stop consonant place of articulation: effects of spectral and temporal discontinuity. J. Acoust. Soc. Am. (1981)
  • K. Corriveau et al. Basic auditory processing skills and specific language impairment: a new look at an old hypothesis. J. Speech Lang. Hear. Res. (2007)
  • O. Creutzfeldt et al. Thalamocortical transformation of responses to complex auditory stimuli. Exp. Brain Res. (1980)
  • S.J. Cruikshank et al. Auditory thalamocortical synaptic transmission in vitro. J. Neurophysiol. (2002)
  • L.L. Elliott et al. Fine-grained auditory discrimination in normal children and children with language-learning problems. J. Speech Lang. Hear. Res. (1989)
  • C.T. Engineer et al. Cortical activity patterns predict speech discrimination ability. Nature Neurosci. (2008)
  • J.A. Freeman et al. Experimental optimization of current source density techniques for anuran cerebellum. J. Neurophysiol. (1975)
  • J.B. Fritz et al. Adaptive changes in cortical receptive fields induced by attention to complex sounds. J. Neurophysiol. (2007)
  • K. Giraud et al. Auditory evoked potential patterns to voiced and voiceless speech sounds in adult developmental dyslexics with persistent deficits. Cerebral Cortex (2005)
  • M. Habib. The neurological basis of developmental dyslexia: an overview and working hypothesis. Brain (2000)
  • S.M. Heath et al. Auditory temporal processing in disabled readers with and without oral language delay. J. Child Psychiatr. (1999)
  • M.S. Hedrick et al. Perceptual weighing of stop consonant cues by normal and impaired listeners in reverberation versus noise. J. Speech Lang. Hear. Res. (2007)
  • P. Helenius et al. Auditory stream segregation in dyslexic adults. Brain (1999)

    The authors declare no competing financial interests.
