Abstract
The cyclic variation in the energy envelope of the speech signal results from the production of speech in syllables. This acoustic property is often identified as a source of information in the perception of syllable attributes, though spectral variation can also provide this information reliably. In the present study of the relative contributions of the energy and spectral envelopes in speech perception, we employed sinusoidal replicas of utterances, which permitted us to examine the roles of these acoustic properties in establishing or maintaining time-varying perceptual coherence. Three experiments were carried out to assess the independent perceptual effects of variation in sinusoidal amplitude and frequency, using sentence-length signals. In Experiment 1, we found that the fine grain of amplitude variation was not necessary for the perception of segmental and suprasegmental linguistic attributes; in Experiment 2, we found that amplitude variation was nonetheless effective in influencing syllable perception, and that in some circumstances it was crucial to segmental perception; in Experiment 3, we observed that coarse-grain amplitude variation, above all, proved to be extremely important in phonetic perception. We conclude that in perceiving sinusoidal replicas, the perceiver derives much from following the coherent pattern of frequency variation and gross signal energy, but probably derives rather little from tracking the precise details of the energy envelope. These findings encourage the view that the perceiver uses time-varying acoustic properties selectively in understanding speech.
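To make the manipulation concrete, the following is a minimal sketch of how a sinusoidal replica might be assembled from frame-wise frequency and amplitude tracks, and how the fine grain of amplitude variation could be replaced by a coarse on/off envelope. It is an illustration under stated assumptions, not the synthesis procedure used in the study: the function names, the 8 kHz sample rate, the 10 ms frame length, and the idea of flattening to the mean level are all hypothetical choices made for this example.

```python
import math


def synthesize_tone(freqs_hz, amps, sample_rate=8000, frame_s=0.01):
    """Synthesize one sinusoid from frame-wise frequency and amplitude tracks.

    Hypothetical helper: each frame holds one frequency (Hz) and one
    amplitude value; phase is accumulated so that frequency changes do
    not introduce phase jumps at frame boundaries.
    """
    if len(freqs_hz) != len(amps):
        raise ValueError("frequency and amplitude tracks must have equal length")
    samples_per_frame = int(round(sample_rate * frame_s))
    out = []
    phase = 0.0
    for freq, amp in zip(freqs_hz, amps):
        step = 2.0 * math.pi * freq / sample_rate
        for _ in range(samples_per_frame):
            out.append(amp * math.sin(phase))
            phase += step
    return out


def sinusoidal_replica(tracks, **kwargs):
    """Sum several tones; `tracks` is a list of (freqs_hz, amps) pairs."""
    tones = [synthesize_tone(f, a, **kwargs) for f, a in tracks]
    return [sum(vals) for vals in zip(*tones)]


def flatten_amplitude(amps, floor=0.0):
    """Keep only coarse on/off energy: active frames get the mean active level.

    Assumption for illustration: a frame counts as active when its
    amplitude exceeds `floor`.
    """
    active = [a for a in amps if a > floor]
    level = sum(active) / len(active) if active else 0.0
    return [level if a > floor else 0.0 for a in amps]
```

With made-up tracks such as `freqs = [500.0, 600.0, 700.0]` and `amps = [0.0, 0.25, 0.75]`, calling `synthesize_tone(freqs, flatten_amplitude(amps))` would produce a tone that follows the same frequency pattern while carrying only the gross presence or absence of energy.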
Additional information
This research was supported by grants from NIDCD (00308) to R. E. Remez, and from NICHHD (01994) to Haskins Laboratories.
About this article
Cite this article
Remez, R.E., Rubin, P.E. On the perception of speech from time-varying acoustic information: Contributions of amplitude variation. Perception & Psychophysics 48, 313–325 (1990). https://doi.org/10.3758/BF03206682
DOI: https://doi.org/10.3758/BF03206682