This paper presents a speaker diarization system evaluated on the NIST Rich Transcription 2009 (RT-09) Meeting Recognition evaluation data set for the Single Distant Microphone (SDM) task. A two-step speaker clustering method is proposed. The first step is speaker cluster initialization using speech segments of the meeting audio: a small subset of speech segments is picked at random and merged iteratively into a number of clusters. The second step is cluster purification, in which a consensus-based speaker segment selection method purifies the clusters for efficient speaker cluster modeling. The system achieves a promising diarization error rate (DER) of 16.4%.
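The two-step idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the merge criterion, or the consensus rule. The sketch assumes each speech segment is already summarized as a small feature vector, merges clusters by squared Euclidean distance between centroids, and stands in for consensus-based selection with a nearest-neighbour majority vote. The names `init_clusters`, `purify`, `subset_size`, and `num_voters` are all hypothetical.

```python
import random


def centroid(points):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed stand-in for the paper's criterion)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_clusters(segments, subset_size, num_clusters, seed=0):
    """Step 1 (sketch): pick a random subset, then merge the closest pair repeatedly."""
    rng = random.Random(seed)
    picked = rng.sample(range(len(segments)), subset_size)
    clusters = [[i] for i in picked]  # each picked segment starts as its own cluster
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sq_dist(centroid([segments[i] for i in clusters[a]]),
                            centroid([segments[i] for i in clusters[b]]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


def purify(segments, clusters, num_voters=3):
    """Step 2 (sketch): keep a segment only if most of its nearest
    neighbours carry the same cluster label (assumed consensus rule)."""
    label = {i: c for c, members in enumerate(clusters) for i in members}
    purified = []
    for c, members in enumerate(clusters):
        kept = []
        for i in members:
            others = sorted((j for j in label if j != i),
                            key=lambda j: sq_dist(segments[i], segments[j]))
            voters = others[:num_voters]
            agree = sum(1 for j in voters if label[j] == c)
            if voters and 2 * agree > len(voters):
                kept.append(i)
        purified.append(kept)
    return purified
```

In this sketch, `purify(segments, init_clusters(segments, subset_size, num_clusters))` returns, for each initial cluster, only the segment indices that the vote retains; a real system would then train speaker models on those retained segments.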
Cite as: Nwe, T.L., Sun, H., Ma, B., Li, H. (2010) Speaker diarization in meeting audio for single distant microphone. Proc. Interspeech 2010, 1505-1508, doi: 10.21437/Interspeech.2010-440
@inproceedings{nwe10_interspeech,
  author={Tin Lay Nwe and Hanwu Sun and Bin Ma and Haizhou Li},
  title={{Speaker diarization in meeting audio for single distant microphone}},
  year=2010,
  booktitle={Proc. Interspeech 2010},
  pages={1505--1508},
  doi={10.21437/Interspeech.2010-440}
}