
Joint quantisation strategies for low bit-rate sinusoidal coding

IET Signal Processing

Speech coding standards deliver high-quality speech above 4 kbps, but transparent quality has not yet been achieved below that rate. There is still room for improvement at lower bit rates, especially at 2.4 kbps and below, a range of interest for military and security applications. This paper discusses strategies for achieving high-quality speech using sinusoidal coding at very low bit rates. It extends previous work on combining several frames into a metaframe and performing variable bit allocation within the metaframe. Experiments were carried out to find the metaframe size that best trades delay against quantisation gains. Metaframes are classified, and each is quantised according to its class for better efficiency. A method for voicing determination from the linear prediction coefficient (LPC) shape is also presented. The proposed techniques were applied to the split-band LPC (SB-LPC) vocoder to produce speech at 1.2 and 0.8 kbps, and compared in a listening test with the original SB-LPC vocoder at 2.4/1.2 kbps and with an established standard, the Mixed Excitation Linear Predictive (MELP) vocoder, at 2.4/1.2/0.6 kbps. The proposed techniques proved effective in reducing the bit rate without compromising speech quality.
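
The trade-off between metaframe size, delay and bit budget can be illustrated with a little arithmetic. The sketch below is illustrative only and is not the paper's method: the 20 ms frame duration, the two class names, the parameter names and the per-class weights are all assumptions introduced here, since the abstract gives none of them. It shows how a fixed bit rate translates into a bit budget for a metaframe, and how that budget might be split differently depending on the metaframe class.

```python
FRAME_MS = 20  # assumed frame duration; not stated in the abstract


def metaframe_budget(bit_rate_bps: int, frames: int, frame_ms: int = FRAME_MS) -> int:
    """Total number of bits available to a metaframe of `frames` frames."""
    return bit_rate_bps * frames * frame_ms // 1000


def metaframe_delay_ms(frames: int, frame_ms: int = FRAME_MS) -> int:
    """Buffering delay incurred by collecting one whole metaframe."""
    return frames * frame_ms


# Hypothetical relative weights per parameter for each metaframe class.
# These numbers are placeholders for illustration, not values from the paper.
CLASS_WEIGHTS = {
    "voiced": {"lsf": 5, "pitch": 2, "gain": 2, "voicing": 1},
    "unvoiced": {"lsf": 6, "pitch": 0, "gain": 3, "voicing": 1},
}


def allocate(budget: int, metaframe_class: str) -> dict[str, int]:
    """Split `budget` bits across parameters in proportion to the class weights."""
    weights = CLASS_WEIGHTS[metaframe_class]
    total = sum(weights.values())
    bits = {name: budget * w // total for name, w in weights.items()}
    # Bits lost to integer rounding are given to the spectral (LSF) parameters.
    bits["lsf"] += budget - sum(bits.values())
    return bits


if __name__ == "__main__":
    for rate in (1200, 800):
        budget = metaframe_budget(rate, frames=4)
        print(rate, budget, metaframe_delay_ms(4), allocate(budget, "voiced"))
```

Under these assumptions, a larger metaframe gives the quantiser more bits to share across frames but adds buffering delay in direct proportion, which is the compromise the experiments in the paper set out to optimise.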

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2009.0077