The most widely used technique for estimating the time-frequency representation of a discrete speech signal is the spectrogram. A technique based on Cohen's class of generalized time-frequency representations (TFRs) is proposed herein, and a method for using this representation in a speech recognition system is described. The kernel design considerations used for analyzing speech signals, and their rationale, are detailed. Several well-known kernel functions as well as several novel kernel functions are studied. The TFR-based coefficients are used to train and test the Apple Computer PlainTalk (TM) speech recognition system. A significant reduction in both the sentence and word error rates is shown.
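As a rough illustration of what a Cohen's class representation involves, the following sketch computes a smoothed pseudo Wigner-Ville distribution, one well-known member of the class, whose kernel is separable into a time-smoothing window and a lag window. It is not the kernel design from the paper, which the abstract does not specify. The function name `smoothed_pseudo_wvd`, its parameters, and the default windows are assumptions made for illustration only.

```python
import numpy as np

def smoothed_pseudo_wvd(x, n_lags=64, time_win=None, lag_win=None):
    """Hypothetical helper: smoothed pseudo Wigner-Ville distribution.

    x is assumed to be a complex (analytic) 1-D signal with len(x) > n_lags.
    Returns a real array of shape (len(x), n_lags). Frequency bin k
    corresponds to k / (2 * n_lags) cycles per sample.
    """
    x = np.asarray(x, dtype=complex)
    n = len(x)
    half = n_lags // 2
    if lag_win is None:
        # Symmetric window over lags -(half-1) .. (half-1); an assumed default.
        lag_win = np.hamming(2 * half - 1)
    if time_win is None:
        time_win = np.ones(1)  # no time smoothing (pseudo Wigner-Ville)
    time_win = np.asarray(time_win, dtype=float)
    time_win = time_win / time_win.sum()

    acf = np.zeros((n, n_lags), dtype=complex)
    for m in range(-(half - 1), half):
        # Instantaneous autocorrelation x[t+m] * conj(x[t-m]) where both
        # indices fall inside the signal; zero elsewhere.
        t = np.arange(abs(m), n - abs(m))
        col = np.zeros(n, dtype=complex)
        col[t] = x[t + m] * np.conj(x[t - m])
        # Separable kernel: smooth along time, then weight by the lag window.
        col = np.convolve(col, time_win, mode="same")
        acf[:, m % n_lags] = lag_win[m + half - 1] * col

    # Fourier transform over lag gives the time-frequency distribution.
    # The lag axis is conjugate-symmetric, so the result is real up to rounding.
    return np.fft.fft(acf, axis=1).real

# Usage: a complex tone at 0.125 cycles/sample.
t = np.arange(256)
tone = np.exp(2j * np.pi * 0.125 * t)
tfr = smoothed_pseudo_wvd(tone, n_lags=64, time_win=np.hanning(9))
peak_bin = int(np.argmax(tfr[128]))
```

In this sketch, changing `time_win` and `lag_win` changes the kernel, which is the design freedom that the Cohen's class formulation exposes. A real-valued speech frame would typically be converted to its analytic form first (for example with `scipy.signal.hilbert`) to limit aliasing and cross-terms between positive and negative frequencies.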
Cite as: Fineberg, A.B., Yu, K.C. (1994) A time-frequency analysis technique for speech recognition signal processing. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 1615-1618, doi: 10.21437/ICSLP.1994-418
@inproceedings{fineberg94_icslp, author={Adam B. Fineberg and Kevin C. Yu}, title={{A time-frequency analysis technique for speech recognition signal processing}}, year=1994, booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)}, pages={1615--1618}, doi={10.21437/ICSLP.1994-418} }