Multilingual tandem bottleneck feature for language identification

Geng, Wang; Li, Jie; Zhang, Shanshan; Cai, Xinyuan; Xu, Bo

doi:10.21437/Interspeech.2015-166

The deep bottleneck (BN) feature based i-vector solution has recently become a popular pipeline for language identification (LID). However, how to extract more effective BN features and how to fully utilize features extracted from deep neural networks (DNNs) have not been well investigated. In this paper, these issues are tackled empirically as follows. First, two novel types of deep features, phone-discriminant and triphone-discriminant, are extracted. Then, DNNs are trained both separately and jointly on multilingual corpora to produce different BN features. Finally, deep BN features are combined in tandem fashion to build enhanced deep features. Experimental results show that systems built on tandem deep features obtain, on average, 19% and 42% relative equal error rate reduction on NIST LRE 2007 over the counterpart built on traditional deep BN features and the cepstral feature based LID system, respectively.
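As a rough illustration of two ideas in the abstract, the sketch below assumes that "tandem" means frame-wise concatenation of frame-aligned feature streams (the abstract does not spell out the exact combination), and that relative equal error rate (EER) reduction is computed as (baseline - new) / baseline. The function names `tandem` and `relative_reduction` and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Frame = List[float]


def tandem(streams: Sequence[Sequence[Frame]]) -> List[Frame]:
    """Concatenate frame-aligned feature streams along the feature axis.

    Assumption: every stream has one feature vector per frame, and all
    streams cover the same frames in the same order.
    """
    if not streams:
        raise ValueError("need at least one feature stream")
    n_frames = len(streams[0])
    if any(len(s) != n_frames for s in streams):
        raise ValueError("all streams must have the same number of frames")
    # For each frame index t, join the vectors of all streams end to end.
    return [[x for s in streams for x in s[t]] for t in range(n_frames)]


def relative_reduction(baseline_eer: float, new_eer: float) -> float:
    """Relative EER reduction of a new system over a baseline."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical toy data: two frames, a 2-dim and a 3-dim BN stream.
bn_a = [[0.1, 0.2], [0.3, 0.4]]
bn_b = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
tandem_feats = tandem([bn_a, bn_b])  # two frames, 5 dims each

# Hypothetical EERs (in percent), chosen only to illustrate the formula.
reduction = relative_reduction(10.0, 8.1)
```

Under these assumptions, the tandem features would then replace a single BN stream as input to the i-vector back end; the dimensionality of each frame is the sum of the dimensionalities of the combined streams.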

Cite as: Geng, W., Li, J., Zhang, S., Cai, X., Xu, B. (2015) Multilingual tandem bottleneck feature for language identification. Proc. Interspeech 2015, 413-417, doi: 10.21437/Interspeech.2015-166

@inproceedings{geng15_interspeech,
  author={Wang Geng and Jie Li and Shanshan Zhang and Xinyuan Cai and Bo Xu},
  title={{Multilingual tandem bottleneck feature for language identification}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={413--417},
  doi={10.21437/Interspeech.2015-166}
}