Abstract
This report describes an experiment on how variability in voice characteristics affects a listener's ability to process sentences, and it outlines a possible model for the processing of continuous speech. Several speakers each recorded the same set of sentences, and the recordings were used to form two experimental tapes. On the first tape, sentences were arranged randomly, so that the listener could predict neither the content of a sentence nor the voice of the speaker who had produced it. On the second tape, sentences were arranged in blocks by speaker, so that there was uncertainty about content but not about voice. Four groups of subjects listened to these tapes, with each tape presented under a low and a high level of background noise. In interpreting the data, emphasis was placed on the idea that the speech signal contains a variety of characteristics or features and that these features are processed by three interacting subsystems. The first is a prosodic system, responsible for segmenting continuous speech into sentences, phrases, and words. It attempts to establish a context within which a second system, responsible for processing words and syllables, can operate. Both of these systems are speaker-dependent, however, in that they use features that must be adjusted on the basis of an assessment of voice characteristics. The model therefore includes a third system, responsible for assessing voice characteristics. In effect, this third subsystem “trains” the two primary systems so that they normalize for differences in individual voice characteristics. We had predicted that performance would be uniformly more accurate in the blocked condition than in the unblocked condition, since one type of uncertainty would have been eliminated. The results confirm this hypothesis only in part, since there is an interaction with variations in background noise. However, a plausible explanation is offered for these findings and for a variety of related results.
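The two tape arrangements and the resulting four listening conditions can be illustrated with a minimal sketch. This is not the authors' procedure: the speaker labels, the sentence labels, the counts, the function names `build_tapes` and `voice_switches`, and the assumption that every speaker's recording of every sentence appears on both tapes are all hypothetical and chosen only for illustration.

```python
import random
from itertools import product


def build_tapes(speakers, sentences, seed=0):
    """Return (random_tape, blocked_tape) as lists of (speaker, sentence) pairs.

    Assumption: both tapes contain the same recordings and differ only in order.
    """
    rng = random.Random(seed)
    items = [(spk, sent) for spk in speakers for sent in sentences]

    # Random tape: neither the voice nor the content is predictable.
    random_tape = items[:]
    rng.shuffle(random_tape)

    # Blocked tape: one block per speaker, so the voice is predictable,
    # while sentence order inside each block is still shuffled.
    blocked_tape = []
    for spk in speakers:
        block = [(spk, sent) for sent in sentences]
        rng.shuffle(block)
        blocked_tape.extend(block)

    return random_tape, blocked_tape


def voice_switches(tape):
    """Count how often the speaker changes between consecutive sentences."""
    return sum(1 for a, b in zip(tape, tape[1:]) if a[0] != b[0])


# Hypothetical labels; the abstract does not state how many speakers or sentences were used.
speakers = ["S1", "S2", "S3", "S4"]
sentences = [f"sentence_{i}" for i in range(1, 7)]

random_tape, blocked_tape = build_tapes(speakers, sentences)

# Two tape arrangements crossed with two noise levels give four conditions,
# matching the four groups of subjects mentioned in the abstract.
conditions = list(product(["random", "blocked"], ["low noise", "high noise"]))

print("voice switches, random tape: ", voice_switches(random_tape))
print("voice switches, blocked tape:", voice_switches(blocked_tape))
print("conditions:", conditions)
```

Counting voice switches makes the contrast concrete: the blocked tape changes speaker only at block boundaries, whereas the random tape can change speaker at almost every sentence, which is the source of voice uncertainty the experiment manipulates.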
Cite this article
Krulee, G.K., Tondo, D.K. & Wightman, F.L. Speech perception as a multilevel processing system. J Psycholinguist Res 12, 531–554 (1983). https://doi.org/10.1007/BF01067960
DOI: https://doi.org/10.1007/BF01067960