Subband approach for automatic speaker recognition: Optimal division of the frequency domain

Besacier, Laurent; Bonastre, Jean-François

doi:10.1007/BFb0015996

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1206)

Included in the following conference series:

International Conference on Audio- and Video-Based Biometric Person Authentication

Abstract

This paper presents a new method for automatic speaker recognition. The principle is to split the whole spectral domain into partial frequency subbands on which recognizers are independently applied and then recombined to yield a global recognition decision. In this article, we particularly discuss the selection of the most critical subbands for the speaker recognition task and the choice of an optimal division of the frequency domain.

Speaker recognition experiments are conducted on different subbands for the 630-speaker TIMIT population using second-order statistical methods. Large differences in identification performance between subbands are observed. In particular, the low-frequency subbands (below 600 Hz) and the high-frequency subbands (above 2000 Hz) are more speaker-specific than the middle-frequency ones. An appropriate selection of the most critical subbands shows that very good performance is still obtained with only half of the frequency domain.
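
The subband principle described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the features are filterbank frames of shape (n_frames, n_channels), that subbands are contiguous groups of channels, that each subband is scored with the arithmetic-harmonic sphericity measure (one member of the second-order statistical family), and that subband scores are recombined by a plain sum. The function names `split_subbands`, `sphericity` and `identify` are hypothetical.

```python
import numpy as np


def split_subbands(frames, n_subbands):
    """Split (n_frames, n_channels) frames into contiguous channel groups."""
    return np.array_split(frames, n_subbands, axis=1)


def sphericity(cov_x, cov_y):
    """Arithmetic-harmonic sphericity measure between two covariance matrices.

    Equals log(tr(Y X^-1) * tr(X Y^-1) / m^2); it is 0 when the matrices are
    proportional and grows as they differ.
    """
    m = cov_x.shape[0]
    a = np.trace(cov_y @ np.linalg.inv(cov_x))
    b = np.trace(cov_x @ np.linalg.inv(cov_y))
    return float(np.log(a * b / m**2))


def band_covariance(band_frames):
    """Covariance of one subband; atleast_2d guards single-channel bands."""
    return np.atleast_2d(np.cov(band_frames, rowvar=False))


def identify(test_frames, reference_frames, n_subbands):
    """Score each reference speaker per subband, then recombine by summing.

    reference_frames maps a speaker label to that speaker's training frames.
    Returns the label with the smallest summed measure and all summed scores.
    """
    test_covs = [band_covariance(b) for b in split_subbands(test_frames, n_subbands)]
    scores = {}
    for speaker, ref in reference_frames.items():
        ref_covs = [band_covariance(b) for b in split_subbands(ref, n_subbands)]
        # Assumption: unweighted sum over subbands as the recombination rule.
        scores[speaker] = sum(sphericity(r, t) for r, t in zip(ref_covs, test_covs))
    return min(scores, key=scores.get), scores
```

Restricting the sum to a chosen subset of subbands would be one way to mimic the subband selection discussed in the abstract; the sum itself is only one of several possible recombination rules.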

Finally, experiments on different subband system architectures show that the correlations between frequency channels are of prime importance for the speaker recognition task. Some of these correlations are lost when the frequency domain is divided into subbands. Consequently, efficient recombination procedures need to be investigated to obtain improved speaker identification results.
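
The loss of inter-channel correlations can be made concrete with a small sketch. Assuming again (hypothetically) filterbank frames of shape (n_frames, n_channels) and contiguous subbands, modelling each subband on its own keeps only the diagonal blocks of the channel correlation matrix. The hypothetical helper `cross_band_fraction` below measures how much of the squared off-diagonal correlation lies outside those blocks and is therefore invisible to independent subband recognizers.

```python
import numpy as np


def cross_band_fraction(frames, n_subbands):
    """Fraction of squared off-diagonal correlation that crosses subband borders."""
    corr = np.corrcoef(frames, rowvar=False)
    n_channels = corr.shape[0]
    # Label every channel with the index of the contiguous subband it falls in.
    band_of = np.concatenate([
        np.full(len(idx), b)
        for b, idx in enumerate(np.array_split(np.arange(n_channels), n_subbands))
    ])
    same_band = band_of[:, None] == band_of[None, :]
    off_diag = ~np.eye(n_channels, dtype=bool)
    total = np.sum(corr[off_diag] ** 2)
    lost = np.sum(corr[off_diag & ~same_band] ** 2)
    return float(lost / total)
```

With a single subband nothing is lost (the fraction is 0), and with one channel per subband every inter-channel correlation is discarded (the fraction is 1); practical divisions of the frequency domain fall between these extremes.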

Author information

Authors and Affiliations

Laboratoire d'Informatique d'Avignon, 339, chemin des Meinajaries, BP 1228, 84140, Avignon Cedex 9, France
Laurent Besacier & Jean-François Bonastre
Laboratoire Parole et Langage, Université de Provence, 29, av. Robert Schuman, 13621, Aix-en-Provence, France
Laurent Besacier

Editor information

Josef Bigün, Gérard Chollet, Gunilla Borgefors

About this paper

Cite this paper

Besacier, L., Bonastre, JF. (1997). Subband approach for automatic speaker recognition: Optimal division of the frequency domain. In: Bigün, J., Chollet, G., Borgefors, G. (eds) Audio- and Video-based Biometric Person Authentication. AVBPA 1997. Lecture Notes in Computer Science, vol 1206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0015996

DOI: https://doi.org/10.1007/BFb0015996
Published: 10 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62660-2
Online ISBN: 978-3-540-68425-1
eBook Packages: Springer Book Archive
