ISCA Archive Eurospeech 2003

Use of a CSP-based voice activity detector for distant-talking ASR

Luca Armani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer

This paper addresses the problem of voice activity detection for distant-talking speech recognition in noisy and reverberant environments. The proposed algorithm is based on the same Cross-power Spectrum Phase (CSP) analysis that is used for talker location and tracking. A normalized feature is derived from it and shown to be more effective than an energy-based one. The algorithm exploits this feature by dynamically updating the threshold as a non-linear average computed over the preceding pause. On a real multichannel database, recorded with the speaker at a distance of 2.5 meters from the microphones, experiments show that the proposed algorithm provides a considerable relative reduction in error rate.
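
As a rough illustration of the kind of normalized feature the abstract refers to, the sketch below computes the peak of a Cross-power Spectrum Phase function for one pair of microphone frames. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `dft` and `csp_peak`, the frame length, and the `eps` regularizer are hypothetical, and a naive DFT is used only to keep the example dependency-free (an FFT routine would normally take its place). The CSP function shown here is the standard phase-only cross-correlation, often called GCC-PHAT; how the paper turns it into its feature and threshold is not specified in the abstract and is not reproduced here.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def csp_peak(frame_a, frame_b, eps=1e-12):
    """Return (peak value, lag in samples) of the CSP function of two frames.

    A positive lag means frame_b is a delayed copy of frame_a.
    eps is an assumed regularizer that avoids division by zero.
    """
    n = len(frame_a)
    spec_a = dft(frame_a)
    spec_b = dft(frame_b)
    # Cross-power spectrum normalized to unit magnitude: only the phase is kept,
    # which makes the peak value independent of the signal energy.
    cross = [a.conjugate() * b for a, b in zip(spec_a, spec_b)]
    phase_only = [c / (abs(c) + eps) for c in cross]
    csp = [v.real for v in dft(phase_only, inverse=True)]
    best = max(range(n), key=lambda i: csp[i])
    # Map the circular index to a signed lag.
    lag = best if best <= n // 2 else best - n
    return csp[best], lag


if __name__ == "__main__":
    rng = random.Random(0)
    source = [rng.gauss(0.0, 1.0) for _ in range(64)]
    delayed = source[-3:] + source[:-3]  # circular delay of 3 samples
    noise = [rng.gauss(0.0, 1.0) for _ in range(64)]
    print("coherent pair:", csp_peak(source, delayed))
    print("unrelated pair:", csp_peak(source, noise))
```

For a coherent source reaching both microphones, the peak approaches 1 regardless of signal level, while for unrelated signals it stays much lower; this level independence is what makes such a feature a plausible alternative to frame energy for detecting speech from a distant talker.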


doi: 10.21437/Eurospeech.2003-180

Cite as: Armani, L., Matassoni, M., Omologo, M., Svaizer, P. (2003) Use of a CSP-based voice activity detector for distant-talking ASR. Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003), 501-504, doi: 10.21437/Eurospeech.2003-180

@inproceedings{armani03_eurospeech,
  author={Luca Armani and Marco Matassoni and Maurizio Omologo and Piergiorgio Svaizer},
  title={{Use of a CSP-based voice activity detector for distant-talking ASR}},
  year=2003,
  booktitle={Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003)},
  pages={501--504},
  doi={10.21437/Eurospeech.2003-180}
}