Speaker diarization in meeting audio for single distant microphone

Nwe, Tin Lay; Sun, Hanwu; Ma, Bin; Li, Haizhou

doi:10.21437/Interspeech.2010-440

This paper presents a speaker diarization system evaluated on the NIST Rich Transcription 2009 (RT-09) Meeting Recognition evaluation data set for the Single Distant Microphone (SDM) task. A two-step speaker clustering method is proposed. The first step is speaker cluster initialization, in which a small subset of speech segments is randomly picked from the meeting audio and merged iteratively into a number of clusters. The second step is cluster purification, which introduces a consensus-based speaker segment selection method for efficient speaker cluster modeling. The system achieves a promising diarization error rate (DER) of 16.4%.
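The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the merging criterion, or the consensus rule. The sketch assumes each speech segment is already summarized as a fixed-length feature vector, merges clusters by closest centroids, and treats "consensus" as agreement between two hypothetical voters (Euclidean distance and cosine similarity). The names `init_clusters` and `purify` and all parameters are illustrative.

```python
import numpy as np


def init_clusters(features, subset_size, n_clusters, rng):
    """Step 1 (assumed form): cluster a random subset by iterative merging."""
    subset = rng.choice(len(features), size=subset_size, replace=False)
    clusters = [[int(i)] for i in subset]  # start with one segment per cluster
    while len(clusters) > n_clusters:
        cents = np.array([features[c].mean(axis=0) for c in clusters])
        dists = np.linalg.norm(cents[:, None, :] - cents[None, :, :], axis=-1)
        np.fill_diagonal(dists, np.inf)  # never merge a cluster with itself
        pair = np.unravel_index(np.argmin(dists), dists.shape)
        a, b = sorted(int(i) for i in pair)
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return np.array([features[c].mean(axis=0) for c in clusters])


def nearest(features, centroids):
    """Index of the Euclidean-nearest centroid for every segment."""
    diff = features[:, None, :] - centroids[None, :, :]
    return np.linalg.norm(diff, axis=-1).argmin(axis=1)


def purify(features, centroids, n_iter=3):
    """Step 2 (assumed form): re-estimate each cluster only from segments
    on which both voters agree, then label all segments."""
    centroids = centroids.copy()
    unit_f = features / np.linalg.norm(features, axis=1, keepdims=True)
    for _ in range(n_iter):
        euc = nearest(features, centroids)
        unit_c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
        cos = (unit_f @ unit_c.T).argmax(axis=1)
        agree = euc == cos  # consensus between the two voters
        for k in range(len(centroids)):
            chosen = agree & (euc == k)
            if chosen.any():
                centroids[k] = features[chosen].mean(axis=0)
    return nearest(features, centroids)
```

With toy features, `purify(feats, init_clusters(feats, subset_size, n_clusters, rng))` returns one cluster label per segment. A real system would model clusters statistically rather than by centroids and would also need speech activity detection, neither of which is shown here.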

Cite as: Nwe, T.L., Sun, H., Ma, B., Li, H. (2010) Speaker diarization in meeting audio for single distant microphone. Proc. Interspeech 2010, 1505-1508, doi: 10.21437/Interspeech.2010-440

@inproceedings{nwe10_interspeech,
  author={Tin Lay Nwe and Hanwu Sun and Bin Ma and Haizhou Li},
  title={{Speaker diarization in meeting audio for single distant microphone}},
  year=2010,
  booktitle={Proc. Interspeech 2010},
  pages={1505--1508},
  doi={10.21437/Interspeech.2010-440}
}