Differentiation of Speech and Song Using Occurrence Pattern of Delta Energy

Ghosal, Arijit; Yasmin, Ghazaala; Banerjee, Debanjan

doi:10.1007/978-981-13-0514-6_74

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 758)

Abstract

Differentiating speech from song in an acoustic signal is a challenging problem and a significant part of automatic audio classification. Most previous work has addressed the classification of speech versus non-speech; comparatively little has addressed speech versus song. Those works relied mostly on frequency-domain and perceptual-domain features. This work proposes a low-dimensional acoustic feature. Song differs from speech in that it contains an instrumental part, which speech lacks, and this raises the energy of a song signal relative to a speech signal. Short-time energy (STE), an acoustic feature, reflects this observation. To study energy variation precisely, features based on very small changes in energy (Delta Energy) and on the co-occurrence matrix of Delta Energy are considered. Several well-known classifiers are employed for classification. Experimental results are compared with existing methodologies to demonstrate the efficiency of the proposed system.
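
The feature pipeline named in the abstract (STE, then Delta Energy, then a co-occurrence matrix) can be illustrated with a minimal sketch. The abstract does not give the chapter's exact definitions, so everything specific here is an assumption made for illustration: STE is taken as the mean squared amplitude per frame, Delta Energy as the difference between consecutive frame energies, and the co-occurrence matrix as a count of adjacent pairs of quantized Delta Energy values. The function names, `frame_len`, `hop`, and `levels` are hypothetical and not taken from the chapter.

```python
import numpy as np

def short_time_energy(signal, frame_len=400, hop=200):
    """Mean squared amplitude of each frame (one common STE definition).

    frame_len and hop are illustrative values, not the chapter's settings.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    return np.array([
        np.mean(signal[i * hop:i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])

def delta_energy(ste):
    """Change in energy between consecutive frames (assumed definition)."""
    return np.diff(np.asarray(ste, dtype=float))

def cooccurrence_matrix(values, levels=8):
    """Quantize values into `levels` bins and count adjacent (i, j) pairs."""
    values = np.asarray(values, dtype=float)
    mat = np.zeros((levels, levels), dtype=int)
    if values.size < 2:
        return mat  # no adjacent pairs to count
    lo, hi = values.min(), values.max()
    if hi == lo:
        q = np.zeros(values.size, dtype=int)  # constant input: single bin
    else:
        scaled = (values - lo) / (hi - lo) * levels
        q = np.minimum(scaled.astype(int), levels - 1)
    # Unbuffered add so repeated (row, col) pairs are all counted.
    np.add.at(mat, (q[:-1], q[1:]), 1)
    return mat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(16000)  # stand-in for one second of audio
    ste = short_time_energy(audio)
    com = cooccurrence_matrix(delta_energy(ste))
    print(com.shape, com.sum())
```

Statistics computed from such a matrix would then serve as the low-dimensional feature vector passed to a classifier; which statistics and which classifiers the chapter uses is not stated in the abstract.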

Author information

Authors and Affiliations

Department of Information Technology, St. Thomas’ College of Engineering and Technology, Kolkata, 700023, West Bengal, India
Arijit Ghosal
Department of Computer Science and Engineering, St. Thomas’ College of Engineering and Technology, Kolkata, 700023, West Bengal, India
Ghazaala Yasmin
Department of Management Information Systems, Sarva Siksha Mission Kolkata, Kolkata, 700042, West Bengal, India
Debanjan Banerjee

Corresponding author

Correspondence to Arijit Ghosal.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Sri Sivani College of Engineering, Srikakulam, Andhra Pradesh, India
Janmenjoy Nayak
Machine Intelligence Research Labs (MIR Labs), Scientific Network for Innovation and Research Excellence, Washington, USA
Ajith Abraham
Department of Mechanical Engineering, Sri Sivani College of Engineering, Srikakulam, Andhra Pradesh, India
B. Murali Krishna
Department of Electrical and Electronics Engineering, Sri Sivani College of Engineering, Srikakulam, Andhra Pradesh, India
G. T. Chandra Sekhar
Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology (IIEST), Shibpur, Howrah, West Bengal, India
Asit Kumar Das

Cite this paper

Ghosal, A., Yasmin, G., Banerjee, D. (2019). Differentiation of Speech and Song Using Occurrence Pattern of Delta Energy. In: Nayak, J., Abraham, A., Krishna, B., Chandra Sekhar, G., Das, A. (eds) Soft Computing in Data Analytics. Advances in Intelligent Systems and Computing, vol 758. Springer, Singapore. https://doi.org/10.1007/978-981-13-0514-6_74

DOI: https://doi.org/10.1007/978-981-13-0514-6_74
Published: 22 August 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0513-9
Online ISBN: 978-981-13-0514-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
