Abstract
Word recognition is generally assumed to be achieved via competition in the mental lexicon between phonetically similar word forms. However, this process has so far been examined only in the context of auditory phonetic similarity. In the present study, we investigated whether the influence of wordform similarity on word recognition holds in the visual modality and with the patterns of visual phonetic similarity. Deaf and hearing participants identified isolated spoken words presented visually on a video monitor. On the basis of computational modeling of the lexicon from visual confusion matrices of visual speech syllables, words were chosen to vary in visual phonetic distinctiveness, ranging from visually unambiguous (lexical equivalence class [LEC] size of 1) to highly confusable (LEC size greater than 10). Identification accuracy was found to be highly related to the word LEC size and frequency of occurrence in English. Deaf and hearing participants did not differ in their sensitivity to word LEC size and frequency. The results indicate that visual spoken word recognition shows strong similarities with its auditory counterpart in that the same dependencies on lexical similarity and word frequency are found to influence visual speech recognition accuracy. In particular, the results suggest that stimulus-based lexical distinctiveness is a valid construct to describe the underlying machinery of both visual and auditory spoken word recognition.
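To make the LEC construct concrete, the following is a minimal sketch and not the authors' implementation. It assumes that phonemes have already been grouped into visually confusable classes (for example, by clustering a visual confusion matrix), that each word is re-transcribed with class labels, and that words whose re-transcriptions are identical fall into the same lexical equivalence class. The phoneme groupings in `PHONEME_CLASS`, the entries in `TOY_LEXICON`, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical phoneme-to-class mapping (illustrative only, not from the study).
# Phonemes sharing a label are treated as visually indistinguishable.
PHONEME_CLASS = {
    "p": "B", "b": "B", "m": "B",
    "f": "F", "v": "F",
    "t": "T", "d": "T", "n": "T", "s": "T", "z": "T",
    "k": "K", "g": "K",
    "ae": "A", "eh": "A",
    "iy": "I", "ih": "I",
}

# Hypothetical toy lexicon: word -> phoneme sequence.
TOY_LEXICON = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "mat": ["m", "ae", "t"],
    "bad": ["b", "ae", "d"],
    "fat": ["f", "ae", "t"],
    "key": ["k", "iy"],
}


def transcode(phonemes):
    """Rewrite a phoneme sequence as a tuple of class labels."""
    return tuple(PHONEME_CLASS[p] for p in phonemes)


def lexical_equivalence_classes(lexicon):
    """Group words whose class-level transcriptions are identical."""
    classes = defaultdict(list)
    for word, phonemes in lexicon.items():
        classes[transcode(phonemes)].append(word)
    return classes


def lec_sizes(lexicon):
    """Return, for each word, the number of words in its equivalence class."""
    classes = lexical_equivalence_classes(lexicon)
    return {word: len(classes[transcode(ph)]) for word, ph in lexicon.items()}


if __name__ == "__main__":
    for word, size in sorted(lec_sizes(TOY_LEXICON).items()):
        print(word, size)
```

Under these assumed groupings, "pat", "bat", "mat", and "bad" collapse into one class of size 4, whereas "fat" and "key" remain visually unique (LEC size of 1). Coarser phoneme groupings would yield larger classes, which is the sense in which LEC size indexes visual confusability.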
Additional information
This research was supported by Grant DC02107 from NIH/NIDCD under the direction of the PI, L.E.B.
About this article
Cite this article
Mattys, S.L., Bernstein, L.E. & Auer, E.T. Stimulus-based lexical distinctiveness as a general word-recognition mechanism. Perception & Psychophysics 64, 667–679 (2002). https://doi.org/10.3758/BF03194734