ISCA Archive Interspeech 2014

Automatic language identification using long short-term memory recurrent neural networks

Javier Gonzalez-Dominguez, Ignacio Lopez-Moreno, Haşim Sak, Joaquin Gonzalez-Rodriguez, Pedro J. Moreno

This work explores the use of Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) for automatic language identification (LID). The use of RNNs is motivated by their ability to model sequences better than the feed-forward networks used in previous work. We show that LSTM RNNs can effectively exploit temporal dependencies in acoustic data, learning features relevant to language discrimination. The proposed approach is compared to baseline i-vector and feed-forward Deep Neural Network (DNN) systems on the NIST Language Recognition Evaluation 2009 dataset. We show that LSTM RNNs achieve better performance than our best DNN system with an order of magnitude fewer parameters. Further, the combination of the different systems leads to significant performance improvements (up to 28%).


doi: 10.21437/Interspeech.2014-483

Cite as: Gonzalez-Dominguez, J., Lopez-Moreno, I., Sak, H., Gonzalez-Rodriguez, J., Moreno, P.J. (2014) Automatic language identification using long short-term memory recurrent neural networks. Proc. Interspeech 2014, 2155-2159, doi: 10.21437/Interspeech.2014-483

@inproceedings{gonzalezdominguez14_interspeech,
  author={Javier Gonzalez-Dominguez and Ignacio Lopez-Moreno and Haşim Sak and Joaquin Gonzalez-Rodriguez and Pedro J. Moreno},
  title={{Automatic language identification using long short-term memory recurrent neural networks}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={2155--2159},
  doi={10.21437/Interspeech.2014-483}
}