A phase-averaged model for the relationship between noisy speech, clean speech and noise in the log-mel domain

Faubel, Friedrich; McDonough, John; Klakow, Dietrich

doi:10.21437/Interspeech.2008-164

In this work, we demonstrate that the most widely used model for the relationship between noisy speech, clean speech and noise in the log-Mel domain is inaccurate because it disregards the phase. Moreover, we show how a more exact model can be derived by averaging over the phase in the log-Mel domain, and how this model can be applied profitably to particle-filter-based sequential noise compensation. Experimental results confirm the superiority of the phase-averaged model, both for clean speech estimation in general and for the particle filter in particular. Reductions in word error rate of up to 17% relative were obtained on a large vocabulary task.
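
To illustrate why disregarding the phase matters, the following is a minimal sketch and not the paper's derivation. It assumes a single frequency bin and a relative phase between speech and noise that is uniformly distributed, whereas the paper averages in the log-Mel domain, where each filterbank channel sums over many bins. The function names `log_sum_model` and `phase_averaged_single_bin` are hypothetical. The sketch compares the common phase-ignoring relationship y = log(e^x + e^n) with the average of the log power of the sum of the two signals over the phase.

```python
import math


def log_sum_model(x, n):
    """Phase-ignoring model: y = log(exp(x) + exp(n)), computed stably."""
    m = max(x, n)
    return m + math.log1p(math.exp(-abs(x - n)))


def phase_averaged_single_bin(x, n, steps=20000):
    """Average of log|X + N|^2 over a uniform relative phase (assumption).

    x and n are log powers of clean speech and noise in ONE frequency bin.
    The average is approximated with the midpoint rule on [0, 2*pi).
    """
    a = math.exp(x / 2.0)  # speech magnitude
    b = math.exp(n / 2.0)  # noise magnitude
    total = 0.0
    for k in range(steps):
        theta = 2.0 * math.pi * (k + 0.5) / steps
        # |X + N|^2 = a^2 + b^2 + 2ab*cos(theta); the last term is the
        # phase-dependent cross term that the log-sum model drops.
        total += math.log(a * a + b * b + 2.0 * a * b * math.cos(theta))
    return total / steps


if __name__ == "__main__":
    x, n = 0.0, -2.0  # illustrative log powers, not data from the paper
    print("log-sum model:  ", log_sum_model(x, n))
    print("phase-averaged: ", phase_averaged_single_bin(x, n))
```

For the illustrative values x = 0 and n = -2, the log-sum model gives about 0.127, while the single-bin phase average is max(x, n) = 0. The gap shows that the expected log power under a random phase is not the log of the summed powers. The size and form of the correction in the log-Mel domain will differ from this single-bin case.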

Cite as: Faubel, F., McDonough, J., Klakow, D. (2008) A phase-averaged model for the relationship between noisy speech, clean speech and noise in the log-mel domain. Proc. Interspeech 2008, 553-556, doi: 10.21437/Interspeech.2008-164

@inproceedings{faubel08_interspeech,
  author={Friedrich Faubel and John McDonough and Dietrich Klakow},
  title={{A phase-averaged model for the relationship between noisy speech, clean speech and noise in the log-mel domain}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={553--556},
  doi={10.21437/Interspeech.2008-164}
}