ISCA Archive Interspeech 2008

Speaker identification for whispered speech based on frequency warping and score competition

Xing Fan, John H. L. Hansen

In certain situations, talkers intentionally whisper instead of using neutral speech for the sake of privacy or confidentiality, which severely degrades the performance of speaker identification systems trained only on neutral speech. The spectral structure of whisper differs considerably from that of neutral speech because of the absence of voice harmonic excitation. This study introduces a new feature based on frequency warping and score competition for the task of speaker identification for whisper. The proposed feature method is evaluated on a corpus of male speakers in both neutral and whisper speech. Closed-set speaker ID results show an absolute 27% improvement in accuracy compared with a traditional MFCC-based system. The results confirm that this is a viable approach to improving speaker ID performance between the neutral and whisper speech conditions.


doi: 10.21437/Interspeech.2008-384

Cite as: Fan, X., Hansen, J.H.L. (2008) Speaker identification for whispered speech based on frequency warping and score competition. Proc. Interspeech 2008, 1313-1316, doi: 10.21437/Interspeech.2008-384

@inproceedings{fan08_interspeech,
  author={Xing Fan and John H. L. Hansen},
  title={{Speaker identification for whispered speech based on frequency warping and score competition}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1313--1316},
  doi={10.21437/Interspeech.2008-384}
}