We study robust pitch-synchronous parameters derived from the envelope and instantaneous frequencies estimated via a bank of cochlear filters. Closed-set speaker identification experiments are performed on the SPIDRE corpus under matched and mismatched handset conditions. The recognizer is based on a hybrid Linear Vector Quantization and Single Layer Perceptron (LVQSLP). Experiments are reported with different codebook sizes. In the mismatched condition, the Mel Frequency Cepstral Coefficients (MFCC) yield a slightly higher recognition rate (68%) than the envelope (58%) and instantaneous frequency (65%) parameters when each is used independently. When the MFCC-based recognizer is used in conjunction with the envelope-based recognizer, the recognition rate increases to 80%. We also report identification rates based on two classes: women and men. In another experiment, listeners were asked to discriminate speakers on a subset of ten females, and we discuss their performance. Finally, we discuss the potential of the approach and of a judicious combination of the parameters to improve speaker identification systems.
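As a rough illustration of the envelope (AM) and instantaneous frequency (FM) parameters mentioned above, the following minimal sketch demodulates a single band-limited signal through its analytic signal. It is an assumption-laden example, not the authors' method: the function names `analytic_signal` and `envelope_and_inst_freq` are hypothetical, a synthetic amplitude-modulated tone stands in for the output of one cochlear filter, and neither the filter bank nor the pitch-synchronous analysis is modeled.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real 1-D array, computed with the FFT.

    Negative-frequency bins are zeroed and positive-frequency bins doubled,
    which is the usual discrete Hilbert-transform construction.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin unchanged
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def envelope_and_inst_freq(x, fs):
    """Return (envelope, instantaneous frequency in Hz) of one channel."""
    z = analytic_signal(x)
    envelope = np.abs(z)                  # AM component
    phase = np.unwrap(np.angle(z))        # continuous phase in radians
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)  # FM component
    return envelope, inst_freq

# Hypothetical stand-in for one filter-bank channel:
# a 1000 Hz carrier with a 50 Hz amplitude modulation, sampled at 8 kHz.
fs = 8000.0
t = np.arange(800) / fs
channel = (1.0 + 0.5 * np.cos(2 * np.pi * 50 * t)) * np.cos(2 * np.pi * 1000 * t)
env, f_inst = envelope_and_inst_freq(channel, fs)
```

In this toy case the recovered envelope follows the 50 Hz modulation and the instantaneous frequency stays at the carrier frequency. A real system would apply such a demodulation to each channel of the filter bank and then summarize the result over pitch periods.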
Cite as: Ezzaidi, H., Rouat, J. (2000) Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, 318-321, doi: 10.21437/ICSLP.2000-273
@inproceedings{ezzaidi00_icslp,
  author={Hassan Ezzaidi and Jean Rouat},
  title={{Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={2},
  pages={318--321},
  doi={10.21437/ICSLP.2000-273}
}