The role of temporal resolution in modulation-based speech segregation

May, Tobias; Bentsen, Thomas; Dau, Torsten

doi:10.21437/Interspeech.2015-78

This study addresses the challenge of automatically segregating a target speech signal from interfering background noise. A computational speech segregation system is presented that exploits logarithmically scaled amplitude modulation spectrogram (AMS) features to distinguish between speech and noise activity in individual time-frequency (T-F) units. One important parameter of the segregation system is the window duration of the analysis-synthesis stage, which determines both the lowest modulation frequency that can be represented and the temporal acuity with which the system can manipulate individual T-F units. To clarify the consequences of this trade-off for modulation-based speech segregation performance, the influence of the window duration was systematically investigated.
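The trade-off described above can be illustrated with a rough calculation. The sketch below is not the authors' system: it assumes the common rule of thumb that a window of duration T can capture at most one full period of a modulation at 1/T Hz, and the function names and window durations are hypothetical values chosen for illustration only.

```python
def lowest_modulation_frequency_hz(window_s: float) -> float:
    """Approximate lowest modulation frequency (Hz) whose full period
    fits inside an analysis window of `window_s` seconds.

    Assumption: the 1/T rule of thumb, not a value taken from the paper.
    """
    if window_s <= 0:
        raise ValueError("window duration must be positive")
    return 1.0 / window_s


def tradeoff_table(window_durations_ms):
    """Return (window duration in ms, approximate lowest modulation
    frequency in Hz) pairs for the given window durations."""
    return [
        (dur_ms, lowest_modulation_frequency_hz(dur_ms / 1000.0))
        for dur_ms in window_durations_ms
    ]


if __name__ == "__main__":
    # Hypothetical window durations; the paper's tested values are not
    # stated in the abstract.
    for dur_ms, f_min in tradeoff_table([8, 16, 32, 64, 128]):
        print(f"window {dur_ms:4d} ms -> lowest modulation freq ~ {f_min:6.2f} Hz")
```

Under this assumption, longer windows reach lower modulation frequencies, but each T-F unit then spans more time, so the system can switch units on or off less precisely.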

Cite as: May, T., Bentsen, T., Dau, T. (2015) The role of temporal resolution in modulation-based speech segregation. Proc. Interspeech 2015, 170-174, doi: 10.21437/Interspeech.2015-78

@inproceedings{may15_interspeech,
  author={Tobias May and Thomas Bentsen and Torsten Dau},
  title={{The role of temporal resolution in modulation-based speech segregation}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={170--174},
  doi={10.21437/Interspeech.2015-78}
}