Distinctive feature fusion for recognition of Australian English consonants

Lewis, Trent W.; Powers, David M. W.

doi:10.21437/Interspeech.2008-662

Audio-visual automatic speech recognition promises to make speech recognition viable in noisy environments. Early and late fusion approaches dominate the field, but they may ignore linguistically relevant features. Distinctive features offer an alternative unit for fusion, and earlier research has shown that this is feasible on subsets of phonemes [1]. This paper outlines two extended models, multiclass and binary, and the results suggest that a 20 dB gain over audio-only recognition is achievable in low-SNR environments.

[1] T. Lewis and D. Powers, "Distinctive feature fusion for improved audio-visual phoneme recognition," in The Eighth International Symposium on Signal Processing and Its Applications, A. Bouzerdoum and A. Beghdadi, Eds. Sydney, Australia: IEEE, 2005.

Cite as: Lewis, T.W., Powers, D.M.W. (2008) Distinctive feature fusion for recognition of Australian English consonants. Proc. Interspeech 2008, 2671-2674, doi: 10.21437/Interspeech.2008-662

@inproceedings{lewis08_interspeech,
  author={Trent W. Lewis and David M. W. Powers},
  title={{Distinctive feature fusion for recognition of Australian English consonants}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2671--2674},
  doi={10.21437/Interspeech.2008-662}
}