Efficient implementation techniques of an SVM-based speech/music classifier in SMV

Multimedia Tools and Applications

Abstract

Real-time speech and audio encoders used in multimedia applications require low-complexity encoding algorithms. Accurate classification of input signals is also a key prerequisite for variable bit rate encoding, which was introduced to make effective use of limited communication bandwidth. This paper investigates implementation issues with a support vector machine (SVM)-based speech/music classifier in the selectable mode vocoder (SMV), a standard codec adopted by the Third-Generation Partnership Project 2 (3GPP2). While an SVM is well known for its superior classification capability, it comes at a high computational cost. To make the system more practical, we propose two techniques that reduce the number of classification requests sent to the SVM-based classifier. The first introduces a simpler classifier that processes some of the input frames in place of the SVM-based classifier; the second skips a portion of the input frames by exploiting the strong inter-frame correlation in speech and music. Our experimental results show that the proposed techniques reduce the computational cost of the SVM-based classifier by 95.4% with negligible performance degradation, making its integration into the SMV codec feasible.
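The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the simpler classifier, its features, or the skipping rule, so everything here is an assumption made for illustration. In particular, `classify_stream`, `toy_cheap`, `toy_svm`, the scalar "feature" per frame, and the parameters `stable_run` and `skip_len` are all hypothetical. The sketch only shows how a cheap first-stage classifier and decision reuse across correlated frames can cut the number of calls to an expensive classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Frame = Sequence[float]
SPEECH, MUSIC = 0, 1


@dataclass
class CascadeStats:
    svm_calls: int = 0        # frames sent to the expensive classifier
    cheap_decisions: int = 0  # frames settled by the cheap classifier
    skipped: int = 0          # frames that reused the previous decision


def classify_stream(
    frames: Sequence[Frame],
    cheap: Callable[[Frame], Optional[int]],
    svm: Callable[[Frame], int],
    stable_run: int = 3,  # hypothetical: identical decisions needed before skipping
    skip_len: int = 2,    # hypothetical: frames skipped once the decision is stable
) -> Tuple[List[int], CascadeStats]:
    """Label frames, calling `svm` only when `cheap` is unsure and no skip is active."""
    stats = CascadeStats()
    labels: List[int] = []
    run = 0      # length of the current run of identical computed decisions
    to_skip = 0  # remaining frames that will reuse the last decision
    for frame in frames:
        if to_skip > 0 and labels:
            # Technique 2 (assumed form): reuse the last decision for correlated frames.
            labels.append(labels[-1])
            to_skip -= 1
            stats.skipped += 1
            continue
        # Technique 1 (assumed form): try the cheap classifier first.
        label = cheap(frame)
        if label is None:
            label = svm(frame)
            stats.svm_calls += 1
        else:
            stats.cheap_decisions += 1
        run = run + 1 if labels and label == labels[-1] else 1
        labels.append(label)
        if run >= stable_run:
            to_skip = skip_len
            run = 0
    return labels, stats


# Toy stand-ins (assumptions): each frame carries one scalar feature in [0, 1].
def toy_cheap(frame: Frame) -> Optional[int]:
    x = frame[0]
    if x < 0.2:
        return SPEECH
    if x > 0.8:
        return MUSIC
    return None  # unsure: defer to the expensive classifier


def toy_svm(frame: Frame) -> int:
    return MUSIC if frame[0] >= 0.5 else SPEECH


if __name__ == "__main__":
    stream = [[0.1]] * 6 + [[0.6]] * 2
    labels, stats = classify_stream(stream, toy_cheap, toy_svm)
    print(labels, stats)
```

In this toy stream, only the two ambiguous frames reach the stand-in SVM; the rest are either settled by the cheap classifier or reuse the previous decision. How much real savings such a scheme yields, and at what accuracy cost, depends entirely on the actual classifiers and thresholds, which the abstract does not give.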



Author information

Corresponding author

Correspondence to Joon-Hyuk Chang.

Additional information

This work was supported by an NRF of Korea grant funded by the MEST (2012R1A2A2A01004895) and by the MSIP, Korea, under the ITRC support program supervised by the NIPA (NIPA-2013-H0301-13-4005).


About this article

Cite this article

Lim, C., Chang, JH. Efficient implementation techniques of an SVM-based speech/music classifier in SMV. Multimed Tools Appl 74, 5375–5400 (2015). https://doi.org/10.1007/s11042-014-1859-8


  • DOI: https://doi.org/10.1007/s11042-014-1859-8
