Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification

Ezzaidi, Hassan; Rouat, Jean

doi:10.21437/ICSLP.2000-273

We study robust pitch-synchronous parameters derived from the envelope and instantaneous frequencies estimated via a bank of cochlear filters. Closed-set speaker identification experiments are performed on the SPIDRE corpus under matched and mismatched handset conditions. The recognizer is based on a hybrid Linear Vector Quantization and Single Layer Perceptron (LVQSLP). Experiments are reported with different codebook sizes. In the mismatched condition, the Mel Frequency Cepstral Coefficients (MFCC) yield a higher identification rate (68%) than the Envelope (58%) and Instantaneous Frequency (65%) parameters when each is used independently. When the MFCC-based recognizer is used in conjunction with the envelope-based recognizer, the recognition rate increases to 80%. We also report identification rates for two classes of speakers: women and men. In another experiment, listeners were asked to discriminate speakers within a subset of ten female speakers, and we discuss their performance. Finally, we discuss the potential of the approach and of a judicious combination of the parameters to improve speaker identification systems.
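The abstract does not say how the MFCC-based and envelope-based recognizers are combined. As an illustration only, the sketch below shows one common way such a combination could be done: normalize each recognizer's per-speaker scores and take a weighted sum. The fusion rule, the `weight` parameter, the function names, and the scores are all assumptions made for this example; none of them come from the paper.

```python
import math

def softmax(scores):
    """Turn raw per-speaker scores into values that sum to 1."""
    peak = max(scores.values())  # subtract the peak for numerical stability
    exps = {spk: math.exp(s - peak) for spk, s in scores.items()}
    total = sum(exps.values())
    return {spk: e / total for spk, e in exps.items()}

def fuse(mfcc_scores, envelope_scores, weight=0.5):
    """Weighted sum of the two normalized score sets (hypothetical rule)."""
    p_mfcc = softmax(mfcc_scores)
    p_env = softmax(envelope_scores)
    return {
        spk: weight * p_mfcc[spk] + (1.0 - weight) * p_env[spk]
        for spk in p_mfcc
    }

def identify(scores):
    """Closed-set decision: pick the enrolled speaker with the top score."""
    return max(scores, key=scores.get)

# Made-up scores for three enrolled speakers (not data from the paper).
mfcc_scores = {"spk_a": 2.0, "spk_b": 2.2, "spk_c": 0.5}
envelope_scores = {"spk_a": 3.0, "spk_b": 1.0, "spk_c": 0.8}

fused = fuse(mfcc_scores, envelope_scores)
best = identify(fused)
```

With these made-up numbers the MFCC scores alone narrowly favor `spk_b`, while the fused scores favor `spk_a`, because the envelope scores separate the two speakers more clearly. This is the kind of complementarity that could make a combined system outperform either recognizer alone.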

Cite as: Ezzaidi, H., Rouat, J. (2000) Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, 318-321, doi: 10.21437/ICSLP.2000-273

@inproceedings{ezzaidi00_icslp,
  author={Hassan Ezzaidi and Jean Rouat},
  title={{Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={2},
  pages={318--321},
  doi={10.21437/ICSLP.2000-273}
}