ISCA Archive Eurospeech 2003

Robust multiple resolution analysis for automatic speech recognition

Roberto Gemello, Franco Mana, Dario Albesano, Renato De Mori

This paper investigates the potential of exploiting the redundancy implicit in Multi Resolution Analysis (MRA) for Automatic Speech Recognition (ASR) systems. Experiments carried out with data collected from home telephones and in cars support the proposed approach for exploiting this redundancy.

Comparisons with Mel Frequency-scaled Cepstral Coefficients (MFCCs) and JRASTA Perceptual Linear Prediction Coefficients (JRASTAPLP) indicate that executing Principal Component Analysis (PCA) on MRA features results in performance superior to that of MFCCs and competitive with that of JRASTAPLP features. Experiments in noisy conditions, using the Italian component of the AURORA3 corpus, show a word error rate (WER) reduction of 15.7% when SNR-dependent Spectral Subtraction (SS) is performed on MRA-PCA features rather than on JRASTAPLP features. Furthermore, SS appears to perform better than Soft Thresholding (ST).


doi: 10.21437/Eurospeech.2003-533

Cite as: Gemello, R., Mana, F., Albesano, D., De Mori, R. (2003) Robust multiple resolution analysis for automatic speech recognition. Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003), 3033-3036, doi: 10.21437/Eurospeech.2003-533

@inproceedings{gemello03_eurospeech,
  author={Roberto Gemello and Franco Mana and Dario Albesano and Renato De Mori},
  title={{Robust multiple resolution analysis for automatic speech recognition}},
  year=2003,
  booktitle={Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003)},
  pages={3033--3036},
  doi={10.21437/Eurospeech.2003-533}
}