Speech perception as a multilevel processing system

Krulee, Gilbert K.; Tondo, Debra K.; Wightman, Frederic L.

doi:10.1007/BF01067960

Published: November 1983

Volume 12, pages 531–554, (1983)

Journal of Psycholinguistic Research

Abstract

This is the report of an experiment that emphasized how variability in voice characteristics will affect a listener's ability to process sentences. Consideration is also given to the description of a possible model for the processing of continuous speech. As part of the experiment, use was made of a number of speakers, each of whom recorded the same set of sentences. The recorded sentences were used in order to form a pair of experimental tapes. On one tape, sentences were arranged randomly such that the listener could predict neither the content of a sentence nor the voice of the speaker who had produced it. On a second tape, sentences were arranged in blocks by speakers such that there was uncertainty as to content but not with respect to voice. Four groups of subjects listened to these tapes, with each tape being processed under a low and high condition of background noise. In interpreting the data, emphasis was placed on the idea that the speech signal contains a variety of characteristics or features and that these features are processed by three interacting subsystems. One is a prosodic system, responsible for the segmentation of continuous speech into sentences, phrases, and words. It attempts to establish a context within which a second system, responsible for the processing of words and syllables, can operate. However, this pair of systems is speaker-dependent in that it makes use of features that need to be adjusted on the basis of an assessment of voice characteristics. Thus, the model also provides for the inclusion of a third system, responsible for the assessment of voice characteristics. In effect, this third subsystem is responsible for “training” the two primary systems in order to normalize for differences in individual voice characteristics. We had predicted that performance would be uniformly more accurate in the blocked condition than in the unblocked condition since one type of uncertainty would have been eliminated. 
The results confirm this hypothesis only in part since there is an interaction with variations in background noise. However, a plausible explanation is offered for these findings and for a variety of related results.
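
The two tape arrangements described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `build_tapes`, the condition labels, the fixed seed, and the assumption that every speaker-sentence pairing appears exactly once on each tape are all hypothetical choices made for illustration.

```python
import random


def build_tapes(speakers, sentences, seed=0):
    """Return (mixed, blocked) presentation orders as lists of (speaker, sentence).

    Assumption: each tape contains every speaker-sentence pairing once.
    The abstract does not state the actual tape contents.
    """
    rng = random.Random(seed)

    # Mixed tape: one shuffle over all items, so neither the voice nor the
    # content of the next sentence can be predicted.
    mixed = [(sp, s) for sp in speakers for s in sentences]
    rng.shuffle(mixed)

    # Blocked tape: items are grouped by speaker, so the voice is known
    # within a block while the sentence order inside the block is random.
    blocked = []
    for sp in speakers:
        block = [(sp, s) for s in sentences]
        rng.shuffle(block)
        blocked.extend(block)

    return mixed, blocked


# Four listener groups: each tape heard under low and high background noise.
CONDITIONS = [(tape, noise)
              for tape in ("mixed", "blocked")
              for noise in ("low", "high")]
```

Both tapes carry the same items; only the ordering differs, which is what isolates uncertainty about voice from uncertainty about content.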

Author information

Authors and Affiliations

Northwestern University, Evanston, Illinois
Gilbert K. Krulee, Debra K. Tondo & Frederic L. Wightman

About this article

Cite this article

Krulee, G.K., Tondo, D.K. & Wightman, F.L. Speech perception as a multilevel processing system. J Psycholinguist Res 12, 531–554 (1983). https://doi.org/10.1007/BF01067960

Accepted: 15 December 1982
Issue Date: November 1983
DOI: https://doi.org/10.1007/BF01067960
