Efficient implementation techniques of an SVM-based speech/music classifier in SMV

Lim, Chungsoo; Chang, Joon-Hyuk

doi:10.1007/s11042-014-1859-8

Published: 01 February 2014

Volume 74, pages 5375–5400 (2015)

Multimedia Tools and Applications

Abstract

Real-time speech and audio encoders used in multimedia applications require low-complexity encoding algorithms. Accurate classification of the input signal is also a key prerequisite for variable bit rate encoding, which was introduced to make effective use of limited communication bandwidth. This paper investigates implementation issues of a support vector machine (SVM)-based speech/music classifier in the selectable mode vocoder (SMV), a standard codec adopted by the Third-Generation Partnership Project 2 (3GPP2). Although the SVM is well known for its superior classification capability, it carries a high computational cost. To make the system more practical, we propose two techniques that reduce the number of classification requests sent to the SVM-based classifier. The first introduces a simpler classifier that processes some of the input frames in place of the SVM-based classifier; the second skips a portion of the input frames, exploiting the strong inter-frame correlation in speech and music frames. Our experimental results show that the proposed techniques reduce the computational cost of the SVM-based classifier by 95.4% with negligible performance degradation, making its integration into the SMV codec feasible.
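
The two request-reduction ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the simpler classifier, the skipping rule, or any thresholds, so `cheap_score`, `svm_classify`, `margin`, `stable_run`, and `skip_len` are hypothetical names and parameters chosen only for illustration. The sketch assumes that a cheap score far from zero can be trusted without consulting the SVM, and that after a run of identical decisions the next few frames can reuse the last label.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Frame = Sequence[float]
SPEECH, MUSIC = 0, 1


def classify_stream(
    frames: Iterable[Frame],
    cheap_score: Callable[[Frame], float],   # hypothetical low-cost scorer
    svm_classify: Callable[[Frame], int],    # hypothetical expensive SVM
    margin: float = 0.5,                     # assumed confidence threshold
    stable_run: int = 3,                     # assumed run length before skipping
    skip_len: int = 4,                       # assumed number of frames to skip
) -> Tuple[List[int], int]:
    """Label every frame and count how many frames reached the SVM."""
    labels: List[int] = []
    svm_calls = 0
    run = 0       # consecutive classified frames with the same label
    to_skip = 0   # frames still to be labeled by reuse

    for frame in frames:
        if to_skip > 0 and labels:
            # Second idea (sketch): rely on inter-frame correlation and
            # reuse the previous decision instead of classifying.
            labels.append(labels[-1])
            to_skip -= 1
            continue

        score = cheap_score(frame)
        if abs(score) >= margin:
            # First idea (sketch): the simple classifier is confident,
            # so the SVM is not consulted for this frame.
            label = MUSIC if score > 0 else SPEECH
        else:
            label = svm_classify(frame)
            svm_calls += 1

        run = run + 1 if labels and labels[-1] == label else 1
        labels.append(label)
        if run >= stable_run:
            to_skip = skip_len
            run = 0

    return labels, svm_calls
```

Calling `classify_stream(frames, cheap_score, svm_classify)` returns the per-frame labels together with the number of SVM invocations, so the saving relative to one SVM call per frame can be measured directly. Skipped frames that straddle a speech/music boundary inherit the old label, which is the accuracy cost such a scheme trades for fewer classifier requests.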

Author information

Authors and Affiliations

Korea National University of Transportation, 50 Daehak-ro, Choungju-si, Chungbuk, Republic of Korea
Chungsoo Lim
Hanyang University, 222 Wangsimni-ro, Seongdong, Seoul, Republic of Korea
Joon-Hyuk Chang

Corresponding author

Correspondence to Joon-Hyuk Chang.

Additional information

This work was supported by an NRF of Korea grant funded by the MEST (2012R1A2A2A01004895) and by the MSIP, Korea, under the ITRC support program supervised by the NIPA (NIPA-2013-H0301-13-4005).

About this article

Cite this article

Lim, C., Chang, JH. Efficient implementation techniques of an SVM-based speech/music classifier in SMV. Multimed Tools Appl 74, 5375–5400 (2015). https://doi.org/10.1007/s11042-014-1859-8

Issue Date: August 2015
DOI: https://doi.org/10.1007/s11042-014-1859-8
