ISCA Archive Interspeech 2014

A data-driven approach to speech enhancement using Gaussian process

Sukanya Sonowal, Kisoo Kwon, Nam Soo Kim, Jong Won Shin

This paper presents a novel data-driven approach to single-channel speech enhancement based on a Gaussian process (GP). The approach applies GP regression to estimate the residual gain, using the a priori and a posteriori signal-to-noise ratios (SNRs) as input features. The residual gain is defined as the difference between the optimal gain and the gain obtained from the minimum mean-square error log-spectral amplitude (MMSE-LSA) estimator. The proposed approach has a cascaded structure consisting of two stages. In the first stage, the gain of the MMSE-LSA estimator is calculated along with the SNR features. In the second stage, the residual gains are estimated through GP regression and used to further enhance the output of the MMSE-LSA module. Experimental results show that the proposed approach produced better speech quality than both the MMSE-LSA enhancement module and the other data-driven technique it was compared against.
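The second stage described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the kernel, the hyperparameters, or the domain in which the residual is measured. The sketch assumes a squared-exponential kernel, a linear-domain residual defined as optimal gain minus MMSE-LSA gain, and clipping of the final gain to [0, 1]. The first-stage MMSE-LSA gain is taken as a precomputed input, and all function names and parameter values are hypothetical.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel (an assumed choice, not stated in the abstract)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def cholesky_solve(lower, rhs):
    """Solve (L L^T) x = rhs by forward then backward substitution."""
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x

def fit_residual_gp(snr_features, residual_gains, noise_var=1e-3, length_scale=1.0):
    """Fit a zero-mean GP mapping (a priori SNR, a posteriori SNR) to residual gain.

    snr_features: list of (xi, gamma) pairs; dB units are an assumption here.
    residual_gains: optimal gain minus MMSE-LSA gain (assumed sign and domain).
    """
    n = len(snr_features)
    gram = [[rbf_kernel(snr_features[i], snr_features[j], length_scale)
             + (noise_var if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = cholesky_solve(cholesky(gram), residual_gains)
    return {"features": snr_features, "alpha": alpha, "length_scale": length_scale}

def predict_residual(model, query):
    """GP posterior mean of the residual gain at one (xi, gamma) query point."""
    return sum(a * rbf_kernel(query, f, model["length_scale"])
               for f, a in zip(model["features"], model["alpha"]))

def enhanced_gain(lsa_gain, residual):
    """Second-stage gain: MMSE-LSA gain plus predicted residual, clipped to [0, 1].

    The clipping is an assumption added for this sketch.
    """
    return min(1.0, max(0.0, lsa_gain + residual))
```

In use, one would fit the model on training pairs of SNR features and residual gains, then for each time-frequency bin call `predict_residual` on that bin's SNR features and pass the result with the first-stage gain to `enhanced_gain`. Exact GP regression scales cubically with the number of training points, so a practical system would likely need a subset of the data or a sparse approximation.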


doi: 10.21437/Interspeech.2014-585

Cite as: Sonowal, S., Kwon, K., Kim, N.S., Shin, J.W. (2014) A data-driven approach to speech enhancement using Gaussian process. Proc. Interspeech 2014, 2847-2851, doi: 10.21437/Interspeech.2014-585

@inproceedings{sonowal14_interspeech,
  author={Sukanya Sonowal and Kisoo Kwon and Nam Soo Kim and Jong Won Shin},
  title={{A data-driven approach to speech enhancement using Gaussian process}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={2847--2851},
  doi={10.21437/Interspeech.2014-585}
}