Feature selection for the classification of crosstalk in multi-channel audio

Wrigley, Stuart N.; Brown, Guy J.; Wan, Vincent; Renals, Steve

doi:10.21437/Eurospeech.2003-172

An extension to the conventional speech/nonspeech classification framework is presented for a scenario in which a number of microphones record the activity of speakers present at a meeting (one microphone per speaker). Since each microphone can receive speech from both the participant wearing the microphone (local speech) and other participants (crosstalk), the recorded audio can be broadly classified in four ways: local speech, crosstalk plus local speech, crosstalk alone, and silence. We describe a classifier in which a Gaussian mixture model (GMM) is used to model each class. A large set of potential acoustic features is considered, some of which have been employed in previous speech/nonspeech classifiers. A combination of two feature selection algorithms is used to identify the optimal feature set for each class. Results from the GMM classifier using the selected features are superior to those of a previously published approach.
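To make the one-GMM-per-class idea concrete, here is a minimal sketch in Python of classifying a single feature vector by picking the class whose mixture assigns it the highest log-likelihood. It is an illustration under stated assumptions, not the authors' implementation: the two features, the diagonal covariances, the two-component mixtures, and every numeric parameter in `MODELS` are hypothetical, and the function names are made up for this example. Only the four class labels come from the abstract.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, components):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify(x, models):
    """Return the class label whose GMM gives x the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(x, models[label]))


# Hypothetical models over two made-up features (an energy-like value and a
# cross-channel-correlation-like value). All numbers are invented for
# illustration; a real system would estimate them from labelled data.
MODELS = {
    "local speech": [
        (0.6, [5.0, 0.2], [1.0, 0.05]),
        (0.4, [6.0, 0.3], [1.5, 0.05]),
    ],
    "crosstalk plus local speech": [
        (0.5, [5.5, 0.6], [1.0, 0.05]),
        (0.5, [6.5, 0.7], [1.5, 0.05]),
    ],
    "crosstalk alone": [
        (0.5, [2.5, 0.8], [1.0, 0.05]),
        (0.5, [3.0, 0.9], [1.0, 0.05]),
    ],
    "silence": [
        (0.7, [0.5, 0.1], [0.5, 0.05]),
        (0.3, [1.0, 0.2], [0.5, 0.05]),
    ],
}

if __name__ == "__main__":
    for frame in ([5.0, 0.2], [2.5, 0.8], [0.5, 0.1]):
        print(frame, "->", classify(frame, MODELS))
```

In practice the mixture parameters would be fitted to labelled training frames (typically with expectation-maximisation) rather than set by hand, and the abstract indicates that each class uses its own selected feature set, which this sketch does not model.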

Cite as: Wrigley, S.N., Brown, G.J., Wan, V., Renals, S. (2003) Feature selection for the classification of crosstalk in multi-channel audio. Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003), 469-472, doi: 10.21437/Eurospeech.2003-172

@inproceedings{wrigley03_eurospeech,
  author={Stuart N. Wrigley and Guy J. Brown and Vincent Wan and Steve Renals},
  title={{Feature selection for the classification of crosstalk in multi-channel audio}},
  year=2003,
  booktitle={Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003)},
  pages={469--472},
  doi={10.21437/Eurospeech.2003-172}
}