Closed-Set Device-Independent Speaker Identification Using CNN

Chakraborty, Tapas; Barai, Bidhan; Chatterjee, Bikshan; Das, Nibaran; Basu, Subhadip; Nasipuri, Mita

doi:10.1007/978-981-15-1084-7_28

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1034)

Included in the following conference series:

International Conference on Intelligent Computing and Communication

Abstract

Speaker Identification (SI) has numerous real-world applications. Traditional classifiers such as Gaussian Mixture Models (GMM), Support Vector Machines (SVM), and Hidden Markov Models (HMM) were used earlier for SI. These classifiers require features such as Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Frequency Cepstral Coefficients (GFCC) to be generated first. However, these approaches do not perform well when the audio data are captured through multiple devices and recorded in different environments, i.e., under mismatched conditions. Machine Learning (ML) algorithms, in contrast, usually provide better accuracy and have therefore become more popular. Restricted Boltzmann Machines (RBM), Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNN) are some of the ML approaches applied to SI. In this paper, a CNN is used for automatic feature extraction and speaker classification on the IITG-MV noisy dataset. The CNN performs better than the GMM, especially in the device-mismatch case.
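As an illustration of the closed-set setting named in the title, the following minimal Python sketch shows how a classifier's per-speaker output scores could be turned into an identification decision. It is not the authors' implementation: the function names, speaker labels, and score values are hypothetical, and the scores stand in for the outputs of a trained network's final layer.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def identify_speaker(logits, speaker_ids):
    """Closed-set decision: always return one of the enrolled speakers."""
    if len(logits) != len(speaker_ids):
        raise ValueError("one score per enrolled speaker is required")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return speaker_ids[best], probs[best]

def accuracy(predicted, actual):
    """Fraction of test utterances assigned to the correct speaker."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

# Hypothetical scores for three enrolled speakers (illustrative values only).
speakers = ["spk_A", "spk_B", "spk_C"]
label, confidence = identify_speaker([1.2, 3.4, 0.3], speakers)
print(label, round(confidence, 3))
```

Because the task is closed-set, the decision rule never rejects an utterance as coming from an unknown speaker; it always picks the best-scoring enrolled speaker. Under this assumption, accuracy in matched and mismatched device conditions could be compared by calling `accuracy` separately on each group of test utterances.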

Acknowledgements

This project is partially supported by the CMATER laboratory of the Computer Science and Engineering Department, Jadavpur University, India, and by the TEQIP-II, PURSE-II, and UPE-II projects of the Govt. of India. Subhadip Basu is partially supported by the Research Award (F.30-31/2016(SA-II)) from UGC, Government of India. This work is also supported by the project sponsored by SERB (Government of India, order no. SB/S3/EECE/054/2016, dated 25/11/2016).

Author information

Authors and Affiliations

Jadavpur University, Kolkata, India
Tapas Chakraborty, Bidhan Barai, Bikshan Chatterjee, Nibaran Das, Subhadip Basu & Mita Nasipuri

Corresponding author

Correspondence to Tapas Chakraborty.

Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow, Uttar Pradesh, India
Vikrant Bhateja
School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT), Bhubaneswar, Odisha, India
Suresh Chandra Satapathy
Department of Informatics, University of Leicester, Leicester, UK
Yu-Dong Zhang
Department of MCA, J. S. S. Science and Technology University, Mysuru, India
V. N. Manjunath Aradhya

About this paper

Cite this paper

Chakraborty, T., Barai, B., Chatterjee, B., Das, N., Basu, S., Nasipuri, M. (2020). Closed-Set Device-Independent Speaker Identification Using CNN. In: Bhateja, V., Satapathy, S., Zhang, YD., Aradhya, V. (eds) Intelligent Computing and Communication. ICICC 2019. Advances in Intelligent Systems and Computing, vol 1034. Springer, Singapore. https://doi.org/10.1007/978-981-15-1084-7_28

DOI: https://doi.org/10.1007/978-981-15-1084-7_28
Published: 18 February 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1083-0
Online ISBN: 978-981-15-1084-7
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
