In this paper, a number of alternative pre-processing configurations are applied to an HMM-based phoneme recognition system and evaluated on the TIMIT speech corpus. Adding processing steps after the initial signal processing is shown to give a considerable advantage. F-ratio analysis gives a clear ranking of the discriminatory power of commonly used features such as log-power, zero-crossing rate, cepstral, delta cepstral and band-power coefficients. A linear discriminant analysis transformation from a 43-variable feature set to a 10-variable linearly transformed feature set yields a 20% reduction in the misclassification rate. Finally, the paper demonstrates that vector quantisation using totally non-parametric classification trees can lead to phoneme classification results competitive with those achieved using traditional techniques, while at the same time offering much faster evaluation.
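The F-ratio ranking mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formula, so this assumes one common definition: the variance of the class means divided by the mean within-class variance, computed per feature. The data are synthetic and the feature names (`separated`, `noise`) and the function `f_ratio` are hypothetical, chosen only for illustration.

```python
import random
from statistics import fmean, pvariance


def f_ratio(values, labels):
    """F-ratio of one feature (assumed definition):
    variance of the class means / mean within-class variance."""
    by_class = {}
    for value, label in zip(values, labels):
        by_class.setdefault(label, []).append(value)
    class_means = [fmean(vs) for vs in by_class.values()]
    class_vars = [pvariance(vs) for vs in by_class.values()]
    return pvariance(class_means) / fmean(class_vars)


# Synthetic data: 3 classes, 200 samples each (not TIMIT data).
random.seed(0)
labels = [c for c in range(3) for _ in range(200)]
features = {
    # Class means are 0, 3 and 6, so this feature separates the classes.
    "separated": [random.gauss(3.0 * c, 1.0) for c in labels],
    # Same distribution for every class, so this feature carries no class information.
    "noise": [random.gauss(0.0, 1.0) for _ in labels],
}

# Score every feature, then rank from most to least discriminative.
scores = {name: f_ratio(vals, labels) for name, vals in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A feature whose class means lie far apart relative to its spread within each class scores high, which is the sense in which such a ratio ranks discriminatory power.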
Cite as: Tridgell, A., Millar, B., Do, K.-A. (1992) Alternative preprocessing techniques for discrete hidden Markov model phoneme recognition. Proc. 2nd International Conference on Spoken Language Processing (ICSLP 1992), 631-634, doi: 10.21437/ICSLP.1992-210
@inproceedings{tridgell92_icslp,
  author={Andrew Tridgell and Bruce Millar and Kim-Anh Do},
  title={{Alternative preprocessing techniques for discrete hidden Markov model phoneme recognition}},
  year=1992,
  booktitle={Proc. 2nd International Conference on Spoken Language Processing (ICSLP 1992)},
  pages={631--634},
  doi={10.21437/ICSLP.1992-210}
}