Frequency modulation (FM) features are typically extracted using a filterbank, usually one based on an auditory frequency scale. However, psychophysical evidence suggests that this scale may not be optimal for extracting speaker-specific information. In this paper, speaker-specific information in FM features is analyzed as a function of the filterbank structure at the feature, model and classification stages. Scatter-matrix-based separation measures at the feature level and measures based on the Kullback-Leibler distance at the model level are used to analyze the discriminative contributions of the different bands. A series of speaker recognition experiments is then performed to study how each band of the FM feature contributes to speaker recognition. A new filterbank structure is proposed that attempts to maximize the speaker-specific information in the FM feature for telephone data. Finally, the distribution of speaker-specific information is analyzed for wideband speech.
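As a rough illustration of the two kinds of per-band measure named in the abstract, the sketch below computes a common scatter-matrix separability criterion, trace(Sw^-1 Sb), and the closed-form Kullback-Leibler divergence between two univariate Gaussians. The abstract does not state the exact criteria or model form used in the paper, so the specific formulas, the single-Gaussian simplification, and the function names `scatter_separability` and `gaussian_kl` are assumptions made for illustration only.

```python
import numpy as np


def scatter_separability(features, labels):
    """Return trace(Sw^-1 Sb) for one band (assumed criterion, not the paper's).

    features: array of shape (n_samples,) or (n_samples, n_dims)
    labels:   array of shape (n_samples,) holding a speaker id per sample
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    if features.ndim == 1:
        features = features[:, None]  # treat a single band as a 1-D feature

    overall_mean = features.mean(axis=0)
    n_dims = features.shape[1]
    s_within = np.zeros((n_dims, n_dims))
    s_between = np.zeros((n_dims, n_dims))

    for speaker in np.unique(labels):
        x = features[labels == speaker]
        speaker_mean = x.mean(axis=0)
        centered = x - speaker_mean
        # Within-speaker scatter: spread of samples around their own speaker mean.
        s_within += centered.T @ centered
        # Between-speaker scatter: spread of speaker means around the overall mean.
        diff = (speaker_mean - overall_mean)[:, None]
        s_between += len(x) * (diff @ diff.T)

    # solve(Sw, Sb) equals Sw^-1 Sb without forming the inverse explicitly.
    return float(np.trace(np.linalg.solve(s_within, s_between)))


def gaussian_kl(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for univariate Gaussians p and q (a simplifying assumption)."""
    return 0.5 * (np.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q
                  - 1.0)
```

Under these assumptions, a band whose FM values differ strongly between speakers yields a larger `scatter_separability` score than a band in which the speakers overlap, and a larger `gaussian_kl` between two speakers' band models indicates that the band separates them more clearly.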
Cite as: Thiruvaran, T., Ambikairajah, E., Epps, J. (2009) Analysis of band structures for speaker-specific information in FM feature extraction. Proc. Interspeech 2009, 1111-1114, doi: 10.21437/Interspeech.2009-39
@inproceedings{thiruvaran09_interspeech,
  author={Tharmarajah Thiruvaran and Eliathamby Ambikairajah and Julien Epps},
  title={{Analysis of band structures for speaker-specific information in FM feature extraction}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1111--1114},
  doi={10.21437/Interspeech.2009-39}
}