This paper explores strategies for recombining independent multi-resolution sub-band-based recognisers. The multi-resolution approach rests on the premise that additional cues for phonetic discrimination may exist in the spectral correlates of one sub-band but not in another. Recombination weights are derived by discriminative training on log-likelihood scores, using the 'Minimum Classification Error' (MCE) criterion. Under this criterion the weights for the correct class and for competing classes are adjusted in opposite directions, which enforces separation of confusable classes. Discriminative recombination is shown to provide significant gains in accuracy for both phone classification and continuous recognition tasks on the TIMIT database. Weighted recombination of independent multi-resolution sub-band models is also shown to improve robustness in broadband noise.
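The weight update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Assumed here and not taken from the paper: class-dependent weights per sub-band, a sigmoid loss on the misclassification measure, the use of only the single best competing class, and the hypothetical names `combine`, `misclassification`, `mce_step`, `lr` and `gamma`.

```python
import math

def combine(weights, scores):
    """Weighted sum over sub-bands of the per-class log-likelihoods.

    weights[b][c] and scores[b][c] refer to sub-band b and class c.
    """
    n_classes = len(scores[0])
    return [sum(w[c] * s[c] for w, s in zip(weights, scores))
            for c in range(n_classes)]

def misclassification(weights, scores, correct):
    """Return (d, rival): d > 0 means the best rival beats the correct class."""
    g = combine(weights, scores)
    rival = max((c for c in range(len(g)) if c != correct),
                key=lambda c: g[c])
    return g[rival] - g[correct], rival

def mce_step(weights, scores, correct, lr=0.01, gamma=1.0):
    """One gradient-descent step on the sigmoid MCE loss (assumed form)."""
    d, rival = misclassification(weights, scores, correct)
    loss = 1.0 / (1.0 + math.exp(-gamma * d))
    slope = gamma * loss * (1.0 - loss)        # d(loss)/d(d)
    new = [list(w) for w in weights]
    for b, s in enumerate(scores):
        # d(d)/d(w[b][correct]) = -s[correct]; d(d)/d(w[b][rival]) = +s[rival]
        new[b][correct] += lr * slope * s[correct]
        new[b][rival] -= lr * slope * s[rival]
    return new, loss

# Illustrative numbers only: 2 sub-bands, 3 classes, class 0 is correct.
scores = [[-10.0, -9.0, -14.0],
          [-8.0, -11.0, -12.0]]
weights = [[1.0, 1.0, 1.0],
           [1.0, 1.0, 1.0]]
updated, loss = mce_step(weights, scores, correct=0)
```

Because log-likelihoods are negative, the step lowers the weights on the correct class and raises those on the rival class, so the two combined scores move apart. Weights of classes that are neither correct nor the best rival are left unchanged in this simplified version.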
Cite as: McMahon, P., McCourt, P., Vaseghi, S. (1998) Discriminative weighting of multi-resolution sub-band cepstral features for speech recognition. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0315, doi: 10.21437/ICSLP.1998-537
@inproceedings{mcmahon98_icslp,
  author={Philip McMahon and Paul McCourt and Saeed Vaseghi},
  title={{Discriminative weighting of multi-resolution sub-band cepstral features for speech recognition}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0315},
  doi={10.21437/ICSLP.1998-537}
}