Selecting phonotactic features for language recognition

Tong, Rong; Ma, Bin; Li, Haizhou; Chng, Eng Siong

doi:10.21437/Interspeech.2010-273

This paper studies feature selection in phonotactic language recognition. Phonotactic features are represented by n-gram statistics derived from one or more phone recognizers, in the form of high-dimensional feature vectors. Two feature selection strategies are proposed to select among the n-gram statistics and reduce the dimension of the feature vectors, so that higher-order n-gram features can be adopted in language recognition. With the proposed feature selection techniques, we achieved an equal error rate (EER) of 1.84% with 4-gram statistics on the 2007 NIST Language Recognition Evaluation 30s closed test sets.
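
As a rough illustration of why selection matters for higher-order n-grams, the sketch below builds phone n-gram count vectors and restricts them to a selected subset. The abstract does not describe the two selection strategies, so the criterion here (keep the k most frequent n-grams in the training data) is only a stand-in assumption and not the authors' method. The function names `ngram_counts`, `select_top_k`, and `to_vector` are hypothetical, and the toy phone labels are made up.

```python
from collections import Counter


def ngram_counts(phones, n):
    """Count the phone n-grams in one decoded phone sequence."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))


def select_top_k(utterances, n, k):
    """Keep the k n-grams with the highest total count over all utterances.

    Assumption: a plain frequency cutoff, used only as a placeholder for
    the selection strategies proposed in the paper.
    """
    total = Counter()
    for phones in utterances:
        total.update(ngram_counts(phones, n))
    # Sort by descending count, breaking ties by n-gram for determinism.
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:k]]


def to_vector(phones, n, vocab):
    """Map one utterance to relative n-gram frequencies over the selected vocab."""
    counts = ngram_counts(phones, n)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[gram] / total for gram in vocab]


# Toy phone sequences standing in for phone recognizer output.
train = [["a", "b", "a", "b"], ["a", "b", "c"]]
vocab = select_top_k(train, n=2, k=2)
vector = to_vector(train[0], n=2, vocab=vocab)
```

With P phones in the recognizer inventory, the full n-gram space has P^n entries, so an unselected 4-gram vector grows very quickly; the vector built here has only k dimensions regardless of n.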

Cite as: Tong, R., Ma, B., Li, H., Chng, E.S. (2010) Selecting phonotactic features for language recognition. Proc. Interspeech 2010, 737-740, doi: 10.21437/Interspeech.2010-273

@inproceedings{tong10_interspeech,
  author={Rong Tong and Bin Ma and Haizhou Li and Eng Siong Chng},
  title={{Selecting phonotactic features for language recognition}},
  year=2010,
  booktitle={Proc. Interspeech 2010},
  pages={737--740},
  doi={10.21437/Interspeech.2010-273}
}