This paper studies feature selection in phonotactic language recognition. The phonotactic features are represented by n-gram statistics derived from one or more phone recognizers, in the form of high-dimensional feature vectors. Two feature selection strategies are proposed to select the n-gram statistics and reduce the dimension of the feature vectors, so that higher-order n-gram features can be adopted in language recognition. With the proposed feature selection techniques, we achieved an equal error rate (EER) of 1.84% with 4-gram statistics on the 2007 NIST Language Recognition Evaluation 30s closed test sets.
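To make the dimensionality issue concrete: with P phones, the full n-gram space has P^n entries, so 4-gram vectors quickly become impractical without selection. The sketch below builds n-gram relative-frequency vectors from decoded phone sequences and keeps only the k most frequent n-grams in the training data. The frequency criterion, the function names (`ngram_counts`, `select_ngrams`, `feature_vector`) and the toy phone sequences are assumptions made for illustration; the abstract does not describe the two proposed strategies, and this is not a reproduction of them.

```python
from collections import Counter


def ngram_counts(phones, n):
    """Count the n-grams in one decoded phone sequence."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))


def select_ngrams(sequences, n, k):
    """Keep the k n-grams with the highest total count over the training set.

    Illustrative criterion only (an assumption), not the paper's strategies.
    """
    total = Counter()
    for phones in sequences:
        total.update(ngram_counts(phones, n))
    # Sort by count (descending), then by n-gram so ties are deterministic.
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:k]]


def feature_vector(phones, n, vocab):
    """Relative frequencies of the selected n-grams in one utterance."""
    counts = ngram_counts(phones, n)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[g] / total for g in vocab]


# Toy phone sequences (hypothetical recognizer output).
train = [
    ["a", "b", "a", "b", "c"],
    ["b", "a", "b", "a", "b"],
]
vocab = select_ngrams(train, n=2, k=2)
vec = feature_vector(train[0], n=2, vocab=vocab)
```

Here the vector has `k` dimensions regardless of the size of the phone inventory, which is what allows a higher order `n` to be used in the downstream classifier.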
Cite as: Tong, R., Ma, B., Li, H., Chng, E.S. (2010) Selecting phonotactic features for language recognition. Proc. Interspeech 2010, 737-740, doi: 10.21437/Interspeech.2010-273
@inproceedings{tong10_interspeech,
  author={Rong Tong and Bin Ma and Haizhou Li and Eng Siong Chng},
  title={{Selecting phonotactic features for language recognition}},
  year=2010,
  booktitle={Proc. Interspeech 2010},
  pages={737--740},
  doi={10.21437/Interspeech.2010-273}
}