Enhancement of noisy speech based on a custom thresholding function with a statistically determined threshold

Sanam, Tahsina Farah; Shahnaz, Celia

doi:10.1007/s10772-012-9144-6

Published: 28 April 2012

International Journal of Speech Technology, Volume 15, pages 463–475 (2012)

Tahsina Farah Sanam¹ &
Celia Shahnaz¹

Abstract

The performance of thresholding-based speech enhancement methods depends largely on an accurate estimate of the threshold value and on the choice of thresholding function. This paper presents a speech enhancement method in which a custom thresholding function is applied to the Wavelet Packet (WP) coefficients of the noisy speech. The function can switch between modified hard and semisoft thresholding, depending on a parameter that characterizes the signal under consideration. The threshold is determined from a statistical model of the Teager energy operated WP coefficients of the noisy speech. Extensive simulations indicate that this threshold, used together with the custom thresholding function, is very effective in reducing not only white noise but also colored noise in noisy speech, yielding enhanced speech with better quality and intelligibility. Several standard objective measures and subjective evaluations, including informal listening tests, show that the proposed method outperforms recent state-of-the-art thresholding-based approaches to noisy speech enhancement from high to low SNR levels.
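As a rough illustration of the building blocks named in the abstract, the sketch below shows the standard discrete Teager energy operator and the textbook hard and semisoft (firm) thresholding rules. It is not the authors' method: the abstract does not give the modified functions, the switching parameter, or the statistical threshold, so the function names, the endpoint handling, and the two-threshold form of the semisoft rule are assumptions made only for this example.

```python
def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Assumption: the two endpoints, where a neighbour is missing,
    are filled by copying the nearest interior value.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 samples")
    interior = [x[i] * x[i] - x[i - 1] * x[i + 1] for i in range(1, n - 1)]
    return [interior[0]] + interior + [interior[-1]]


def hard_threshold(c, t):
    """Standard hard thresholding: keep the coefficient only if |c| > t."""
    return c if abs(c) > t else 0.0


def semisoft_threshold(c, t1, t2):
    """Standard semisoft (firm) thresholding with 0 <= t1 < t2.

    Coefficients below t1 are zeroed, coefficients above t2 are kept,
    and the range in between is scaled linearly.
    """
    if not 0 <= t1 < t2:
        raise ValueError("require 0 <= t1 < t2")
    a = abs(c)
    if a <= t1:
        return 0.0
    if a >= t2:
        return c
    sign = 1.0 if c > 0 else -1.0
    return sign * t2 * (a - t1) / (t2 - t1)


if __name__ == "__main__":
    # Hypothetical coefficient values, chosen only to exercise each branch.
    coeffs = [0.5, -1.5, 3.0]
    print([hard_threshold(c, 1.0) for c in coeffs])
    print([semisoft_threshold(c, 1.0, 2.0) for c in coeffs])
    print(teager_energy([1.0, 2.0, 3.0, 4.0, 5.0]))
```

In a full pipeline these functions would be applied to the WP coefficients of each subband, with the threshold taken from the statistical model described in the paper rather than from the fixed values used here.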

Author information

Authors and Affiliations

Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1000, Bangladesh
Tahsina Farah Sanam & Celia Shahnaz

Corresponding author

Correspondence to Celia Shahnaz.

About this article

Cite this article

Sanam, T.F., Shahnaz, C. Enhancement of noisy speech based on a custom thresholding function with a statistically determined threshold. Int J Speech Technol 15, 463–475 (2012). https://doi.org/10.1007/s10772-012-9144-6

Received: 12 December 2011
Accepted: 06 April 2012
Published: 28 April 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s10772-012-9144-6
